A performance analysis framework for SOCP algorithms in noisy compressed sensing



MIHAILO STOJNIC



School of Industrial Engineering 



Purdue University, West Lafayette, IN 47907

e-mail: mstojnic@purdue.edu

Abstract

Solving under-determined systems of linear equations with sparse solutions has attracted an enormous amount of attention in recent years, above all due to the work of [12, 13, 26]. In [12, 13, 26] it was rigorously shown for the first time that, in a statistical and large dimensional context, a linear sparsity can be recovered from an under-determined system via a simple polynomial $\ell_1$-optimization algorithm. [13] went even further and established that in noisy systems, for any linear level of under-determinedness, there is again a linear sparsity that can be approximately recovered through an SOCP (second order cone programming) noisy equivalent to $\ell_1$. Moreover, the approximate solution is (in an $\ell_2$-norm sense) guaranteed to be no further from the sparse unknown vector than a constant times the noise. In this paper we will also consider solving noisy linear systems and present an alternative statistical framework that can be used for their analysis. To demonstrate how the framework works we will show how one can use it to precisely characterize the approximation error of a wide class of SOCP algorithms. We will also show that our theoretical predictions are in solid agreement with the results one can get through numerical simulations.



Index Terms: Noisy systems of linear equations; SOCP; $\ell_1$-optimization; compressed sensing.

1 Introduction



In this paper we focus on studying mathematical properties of under-determined systems of linear equations with sparse solutions (studying these systems from both a theoretical and a practical point of view has attracted enormous attention in recent years, see, e.g. [4, 10, 14, 22, 31, 46, 50, 54, 56-58, 71, 73] and references therein). In its simplest form, solving an under-determined system of linear equations amounts to finding a, say, $k$-sparse $x$ such that

$$Ax = y \tag{1}$$

where $A$ is an $m \times n$ ($m < n$) matrix and $y$ is an $m \times 1$ vector (see Figure 1; here and in the rest of the paper, under a $k$-sparse vector we assume a vector that has at most $k$ nonzero components). Of course, the assumption will be that such an $x$ exists. To make writing in the rest of the paper easier, we will assume the so-called linear regime, i.e. we will assume that $k = \beta n$ and that the number of equations is $m = \alpha n$, where $\alpha$ and $\beta$ are constants independent of $n$ (more on the non-linear regime, i.e. on the regime when $m$ is larger than linearly proportional to $k$, can be found in e.g. [21, 35, 36]).

If one has freedom to design the matrix $A$ then the results from [2, 47, 53] demonstrated that the techniques from coding theory (based on the coding/decoding of Reed-Solomon codes) can be employed to determine any $k$-sparse $x$ in (1) for any $0 < \alpha \le 1$ and any $\beta \le \frac{\alpha}{2}$ in polynomial time. It is relatively easy to show that under the unique recoverability assumption $\beta$ can not be greater than $\frac{\alpha}{2}$. Therefore, as long as one is concerned with the unique recovery of a $k$-sparse $x$ in (1) in polynomial time, the results from [2, 47, 53] are optimal. The complexity of the algorithms from [2, 47, 53] is roughly $O(n^3)$. In a similar fashion one can, instead of using coding/decoding techniques associated with Reed-Solomon codes, design the matrix and the corresponding recovery algorithm based on the techniques related to the coding/decoding of Expander codes (see e.g. [42, 43, 74] and references therein). In that case recovering $x$ in (1) is significantly faster for large dimensions $n$. Namely, the complexity of the techniques from e.g. [42, 43, 74] (or their slight modifications) is usually $O(n)$, which is clearly for large $n$ significantly smaller than $O(n^3)$. However, the techniques based on coding/decoding of Expander codes usually do not allow for $\beta$ to be as large as $\frac{\alpha}{2}$.

Figure 1: Model of a linear system; vector $x$ is $k$-sparse

On the other hand, if one has no freedom in the choice of $A$, designing the algorithms to find the $k$-sparse $x$ in (1) is substantially harder. In fact, when there is no choice in $A$ the recovery problem (1) becomes NP-hard. Two algorithms, 1) Orthogonal matching pursuit (OMP) and 2) Basis pursuit (BP), i.e. $\ell_1$-optimization (and their different variations), have often been viewed as solid heuristics for solving (1) (in recent years belief propagation type of algorithms are emerging as strong alternatives as well). Roughly speaking, OMP algorithms are faster but can recover smaller sparsity whereas the BP ones are slower but recover higher sparsity. In a more precise way, under certain probabilistic assumptions on the elements of $A$ it can be shown (see e.g. [52, 68, 69]) that if $m = O(k \log(n))$ OMP (or a slightly modified OMP) can recover $x$ in (1) with complexity of recovery $O(n^2)$. On the other hand, a stage-wise OMP from [30] recovers $x$ in (1) with complexity of recovery $O(n \log n)$. Somewhere in between OMP and BP are the recent improvements CoSAMP (see e.g. [51]) and Subspace pursuit (see e.g. [23]), which guarantee (assuming the linear regime) that the $k$-sparse $x$ in (1) can be recovered in polynomial time with $m = O(k)$ equations. This is the same performance guarantee established in [13, 26] for the BP.

We now introduce the BP concept (or, as we will refer to it, the $\ell_1$-optimization concept; a slight modification/adaptation of it will actually be the main topic of this paper). Variations of the standard $\ell_1$-optimization from e.g. [15, 19, 61] as well as those from [25, 33, 38-40, 60] related to $\ell_q$-optimization, $0 < q < 1$, are possible as well; moreover they can all be incorporated in what we will present below. The $\ell_1$-optimization concept suggests that one can maybe find the $k$-sparse $x$ in (1) by solving the following $\ell_1$-norm minimization problem

$$\min_{x} \ \|x\|_1 \quad \text{subject to} \quad Ax = y. \tag{2}$$
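As an illustration of (2), the following is a minimal sketch that casts the $\ell_1$-norm minimization as a linear program by splitting $x = p - q$ with $p, q \ge 0$. The function name `basis_pursuit`, the problem sizes, and the use of SciPy's `linprog` are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def basis_pursuit(A, y):
    """Solve min ||x||_1 subject to A x = y as a linear program.

    Uses the split x = p - q with p, q >= 0, so that the objective
    sum(p) + sum(q) equals ||x||_1 at the optimum.
    """
    m, n = A.shape
    cost = np.ones(2 * n)            # objective: sum(p) + sum(q)
    A_eq = np.hstack([A, -A])        # equality constraint A p - A q = y
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]


rng = np.random.default_rng(0)
n, m, k = 40, 30, 3                  # illustrative sizes, not from the paper
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[-k:] = rng.uniform(1.0, 2.0, size=k)   # k nonzero components
y = A @ x_true
x_hat = basis_pursuit(A, y)
```

The returned `x_hat` is feasible for (2) and its $\ell_1$ norm cannot exceed that of `x_true`, since `x_true` is itself feasible. Whether `x_hat` coincides with `x_true` depends on the sparsity level relative to the number of equations.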



As is then shown in [13], if $\alpha$ and $n$ are given, and $A$ is given and satisfies the restricted isometry property (RIP) (more on this property the interested reader can find in e.g. [1, 5, 11-13, 59]), then any unknown vector $x$ with no more than $k = \beta n$ (where $\beta$ is a constant dependent on $\alpha$ and explicitly calculated in [13]) non-zero elements can indeed be recovered by solving (2). In a statistical and large dimensional context, in [26] and later in [65], for any given value of $\beta$ the exact value of the maximum possible $\alpha$ was determined.

As we mentioned earlier the above scenario is in a sense idealistic. Namely, it assumes that $y$ in (2) was obtained through (1). On the other hand, in many applications only a noisy version of $Ax$ may be available for $y$ (this is especially so in measuring type of applications), see, e.g. [13, 41, 72]. When that happens one has the following equivalent to (1) (see Figure 2)

$$y = Ax + v, \tag{3}$$



where $v$ is an $m \times 1$ so-called noise vector (the so-called ideal case presented above is of course a special case of the noisy one given in (3)). Finding the $k$-sparse $x$ in (3) is now incredibly hard, in fact it is pretty much impossible. Basically, one is looking for a $k$-sparse $x$ such that (3) holds and on top of that $v$ is unknown. Although the problem is hard there are various heuristics throughout the literature that one can use to solve it approximately. The majority of these heuristics are based on appropriate generalizations of the corresponding algorithms one would use in the noiseless case. Thinking along the same lines as in the noiseless case one can distinguish two scenarios depending on the availability of the freedom to choose/design $A$. If one has the freedom to design $A$ then one can adapt the corresponding noiseless algorithms to the noisy scenario as well (more on this can be found in e.g. [7]). However, in this paper we mostly focus on the scenario where one has no control over $A$. In such a scenario one can again make a parallel to the noiseless case and distinguish two groups of algorithms that were historically viewed as good heuristics for finding approximate solutions to noisy under-determined systems: 1) generalizations of OMP and 2) generalizations of BP. Among various generalizations of OMP we briefly focus only on the following three that we think had a significant impact on the field in recent years. Namely, an improvement of standard OMP called ROMP, introduced in [52], can be proven to work well in the noisy case as well. The same is true for CoSAMP from [51] or Subspace pursuit from [24]. Essentially, in a statistical context, the latter two (the one from [52] has a slightly worse performance guarantee) can provably recover a linear sparsity while maintaining the approximation error proportional to the $\ell_2$ norm of the noise vector. These algorithms are very successful in quick recovery of linear sparsity of a certain level. In the noiseless case, all of them can be thought of as perfected versions of OMP. Given their robustness with respect to the noise one can think of them as perfected noisy versions of OMP as well.

Figure 2: Model of a linear system; vector $x$ is $k$-sparse

In this paper we will focus on the second group of algorithms, i.e. we will focus on generalizations of BP that can handle the noisy case. To introduce a bit of tractability in finding the $k$-sparse $x$ in (3) one usually assumes a certain amount of knowledge about either $x$ or $v$. As far as tractability assumptions on $v$ are concerned, one typically (and possibly fairly reasonably in applications of interest) assumes that $\|v\|_2$ is bounded (or highly likely to be bounded) from above by a certain known quantity. The following second-order cone programming (SOCP) analogue to (or say noisy generalization of) (2) is one of the approaches that utilizes such an assumption (more on this approach and its variations can be found in e.g. [13])

$$\min_{x} \ \|x\|_1 \quad \text{subject to} \quad \|y - Ax\|_2 \le r \tag{4}$$

where $r$ is a quantity such that $\|v\|_2 \le r$ (or $r$ is a quantity such that $\|v\|_2 \le r$ is, say, highly likely). For example, in [13] a statistical context is assumed and, based on the statistics of $v$, $r$ was chosen such that $\|v\|_2 \le r$ happens with overwhelming probability (as usual, under overwhelming probability we in this paper assume a probability that is no more than a number exponentially decaying in $n$ away from 1). Given that (4) is now among the few almost standard choices when it comes to finding an approximation to the $k$-sparse $x$ in (3), the literature on its properties when applied in various contexts is vast (see, e.g. [13, 29, 67] and references therein). We here briefly mention only what we consider to be the most influential work on this topic in recent years. Namely, in [13] the authors analyzed the performance of (4) and showed a result similar in flavor to the one that holds in the ideal - noiseless - case. In a nutshell the following was shown in [13]: let $\tilde{x}$ be a $\beta n$-sparse vector such that (3) holds and let $\hat{x}_{socp}$ be the solution of (4). Then

$$\|\hat{x}_{socp} - \tilde{x}\|_2 \le C r \tag{5}$$

where $\beta$ is a constant independent of $n$ and $C$ is a constant independent of $n$ and of course dependent on $\alpha$ and $\beta$. This result in a sense establishes a noisy equivalent to the fact that a linear sparsity can be recovered from an under-determined system of linear equations. In informal language, it states that a linear sparsity can be approximately recovered in polynomial time from a noisy under-determined system with the norm of the recovery error guaranteed to be within a constant multiple of the noise norm (as mentioned above, the same was also established later in [51] for CoSAMP and in [24] for Subspace pursuit). Establishing such a result is, of course, a feat in its own class, not only because of its technical contribution but even more so because of the amount of interest that it generated in the field.
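The role of $r$ in (4) can be illustrated numerically. The sketch below assumes i.i.d. zero-mean Gaussian noise with standard deviation `sigma` and uses a hypothetical rule $r = (1 + \text{slack})\,\sigma\sqrt{m}$. The rule, the function name `noise_radius`, and all numerical values are assumptions made for illustration and are not the specific choice made in [13].

```python
import numpy as np


def noise_radius(sigma, m, slack=0.2):
    """Hypothetical rule for r in (4): r = (1 + slack) * sigma * sqrt(m)."""
    return (1.0 + slack) * sigma * np.sqrt(m)


rng = np.random.default_rng(1)
sigma, m, trials = 0.5, 400, 2000     # illustrative values only
r = noise_radius(sigma, m)

# Each row is one independent draw of the m-dimensional noise vector v.
v = sigma * rng.standard_normal((trials, m))
norms = np.linalg.norm(v, axis=1)

# Fraction of draws for which ||v||_2 <= r holds.
fraction_inside = np.mean(norms <= r)
```

Since $\|v\|_2$ concentrates around $\sigma\sqrt{m}$, any fixed positive `slack` makes the event $\|v\|_2 \le r$ increasingly likely as $m$ grows, which is the sense in which such an $r$ holds "with overwhelming probability".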

In this paper we will also consider an approximate recovery of the $k$-sparse $x$ in (3). Moreover, we will also focus on the SOCP algorithms defined in (4). We will develop a novel framework for performance characterization of these algorithms. Among other things, in a statistical context, the framework will enable us to precisely characterize their approximation error.

We should also mention that SOCP algorithms are by no means the only possible generalizations (adaptations) of $\ell_1$-optimization to the noisy case. For example, LASSO algorithms (more on these algorithms can be found in e.g. [9, 17, 18, 49, 66, 70] as well as in recent developments [6, 27, 62]) are a very successful alternative. In our recent work [62] we established a nice connection between some of the algorithms from the LASSO group and certain SOCP algorithms. Towards the end of the present paper we will revisit that connection and provide a few additional insights. Another interesting alternative to the SOCP or the LASSO algorithms is the so-called Dantzig selector introduced in [16] (more on the Dantzig selector as well as on its relation to the LASSO algorithms can be found in e.g. [3, 8, 32, 34, 44, 45, 48]). In a nutshell, LASSO and SOCP algorithms are likely to provide a better recovery performance than the Dantzig selector in a variety of scenarios and with respect to a variety of performance measures, whereas the Dantzig selector as a linear program promises to be faster. Of course a fair comparison would go way beyond this short observation, especially so with plenty of room for improvement in numerical implementations specifically tailored for linear programs such as the Dantzig selector or with the recent development of fast belief propagation type of LASSO-like implementations (see, e.g. [6, 27]).

Before we proceed further we briefly summarize the organization of the rest of the paper. In Section 2, we present a statistical framework for the performance analysis of the SOCP algorithms. To demonstrate its power, towards the end of Section 2 we compute, for any given $\alpha$ and $\beta$, the worst case approximation error that (4) makes when used for approximate recovery of general sparse vectors $x$ from (3). In Section 3 we then specialize results from Section 2 to the so-called signed vectors $x$. In Section 4 we will revisit a connection between the SOCP algorithms and the LASSO alternatives. Finally, in Section 5 we discuss the obtained results.

2 SOCP's performance analysis framework - general x 

In this section we create a statistical framework for the performance analysis of SOCP. Before proceeding further we will now explicitly state the major assumptions that we will make (the remaining ones will be made appropriately throughout the analysis). Namely, in the rest of the paper we will assume that the elements of $A$ are i.i.d. standard normal random variables. We will also assume that the elements of $v$ are i.i.d. Gaussian random variables with zero mean and variance $\sigma^2$. We will assume that $\tilde{x}$ is the original $x$ in (3) that we are trying to recover and that it is any $k$-sparse vector with a given fixed location of its nonzero elements and a given fixed combination of their signs. Since the analysis (and the performance of (4)) will clearly not depend on which particular location and which particular combination of signs of nonzero elements are chosen, we can, for the simplicity of the exposition and without loss of generality, assume that the components $\tilde{x}_1, \tilde{x}_2, \dots, \tilde{x}_{n-k}$ of $\tilde{x}$ are equal to zero and the components $\tilde{x}_{n-k+1}, \tilde{x}_{n-k+2}, \dots, \tilde{x}_n$ of $\tilde{x}$ are greater than or equal to zero. Moreover, throughout the paper we will call such an $\tilde{x}$ $k$-sparse and positive. In a more formal way we will set

$$\tilde{x}_1 = \tilde{x}_2 = \dots = \tilde{x}_{n-k} = 0$$
$$\tilde{x}_{n-k+1} \ge 0, \ \tilde{x}_{n-k+2} \ge 0, \ \dots, \ \tilde{x}_n \ge 0. \tag{6}$$

We also now take the opportunity to point out a rather obvious detail. Namely, the fact that $\tilde{x}$ is positive is assumed for the purpose of the analysis. However, this fact is not known a priori and is not available to the solving algorithm (this will of course change in Section 3).
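The statistical setup described above can be sketched in a few lines. The helper below, `make_instance`, is a hypothetical name; the uniform distribution of the nonzero magnitudes and the numerical values are arbitrary choices made for illustration, since the text only fixes the locations and signs of the nonzero components. The noise is assumed to have standard deviation `sigma`.

```python
import numpy as np


def make_instance(n, alpha, beta, sigma, rng):
    """Draw (A, x_tilde, v, y) with m = alpha*n equations and k = beta*n nonzeros."""
    m = int(round(alpha * n))
    k = int(round(beta * n))
    A = rng.standard_normal((m, n))           # i.i.d. standard normal entries
    v = sigma * rng.standard_normal(m)        # i.i.d. zero-mean Gaussian noise
    x_tilde = np.zeros(n)                     # first n - k components are zero
    x_tilde[n - k:] = rng.uniform(0.0, 1.0, size=k)   # last k are nonnegative
    y = A @ x_tilde + v                       # noisy measurements as in (3)
    return A, x_tilde, v, y


rng = np.random.default_rng(2)
A, x_tilde, v, y = make_instance(n=200, alpha=0.5, beta=0.1, sigma=0.3, rng=rng)
```

Placing the nonzero components last mirrors the convention of (6); any other fixed support and sign pattern could be used in the same way.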

Once we establish the framework it will be clear that it can be used to characterize many of the SOCP features. We will defer these details to a collection of forthcoming papers. However, in this paper we will demonstrate a small application that relates to a classical question of determining the approximation error that (4) makes when used to recover any $k$-sparse $\tilde{x}$ that satisfies (3) and is from a set of vectors with a given fixed location of nonzero elements and a given fixed combination of their signs. The approximation error that we will focus on will be the $\ell_2$ norm of the error vector (one can of course characterize the approximation error in many other ways; for example, one such way that attracted a lot of attention in recent years is the so-called error in the support recovery; more in this direction can be found in e.g. [72] or in e.g. [9, 45] when one is not necessarily concerned with the SOCP type of algorithms).

Before proceeding further we will introduce a few definitions that will be useful in formalizing the above mentioned application as well as in conducting the entire analysis. As is natural, we start with the solution of (4). As earlier, let $\hat{x}_{socp}$ be the solution of (4) and further let $w_{socp} \in R^n$ be such that

$$\hat{x}_{socp} = \tilde{x} + w_{socp}. \tag{7}$$

As mentioned above, as an application of our framework we will compute the largest possible value of $\|\hat{x}_{socp} - \tilde{x}\|_2 = \|w_{socp}\|_2$ for any combination $(\alpha, \beta)$. Or more rigorously, for any combination $(\alpha, \beta)$, we will find a $d_{socp}$ such that

$$\lim_{n \to \infty} P(d_{socp} - \epsilon \le \max \|w_{socp}\|_2 \le d_{socp} + \epsilon) = 1 \tag{8}$$

for an arbitrarily small constant $\epsilon$. However, before doing so, in the following three subsections we will present the general framework. Towards the end of the third subsection and in the fourth one we will then demonstrate how it can be used to determine $d_{socp}$.

The framework that we will present below will center around the optimal value of the objective function in (4) (of course in a probabilistic context). We will divide the presentation into several subsections. In the first one we will compute a "high-probability" upper bound on the value of that objective. In the second one we will then show how one can design a mechanism to obtain a "high-probability" lower bound on the optimal value of (4). In later subsections we will show that the two bounds can match each other. Now, before we start the technical details we will rewrite (4) in the following way

$$\min_{x} \ \|x\|_1 - \|\tilde{x}\|_1 \quad \text{subject to} \quad \|y - Ax\|_2 \le r_{socp}. \tag{9}$$

One should note that this modification of (4) is for analysis purposes only, i.e. (9) is not the algorithm one would be running in the search for an approximation to $\tilde{x}$ ((9) can not be run anyway, since it requires knowledge of $\|\tilde{x}\|_1$, which is of course unavailable). The SOCP algorithm one would actually use to find an approximation to $\tilde{x}$ is the one in (4). It is just for ease of exposition that we will look at the modification (9) and not at the original problem (4). Also, one should note that $r$ in (4) or $r_{socp}$ in (9) is a parameter that critically impacts the outcome of any SOCP type of algorithm (in fact, for different $r$'s one will have different SOCP's). The analysis that we will present assumes a general $r$ that we will call $r_{socp}$. We will of course later in the paper (basically when the analysis is done) comment in more detail on the effect that the choice of $r_{socp}$ has on the analysis or, more importantly, on the performance of the optimization algorithm from (4).

Given that we will be dealing with (9) let us define the optimal value of its objective in the following way

$$f_{obj}(\sigma, \tilde{x}, A, v, r_{socp}) = \min_{x} \ \|x\|_1 - \|\tilde{x}\|_1 \quad \text{subject to} \quad \|y - Ax\|_2 \le r_{socp}. \tag{10}$$

To make writing easier we will instead of $f_{obj}(\sigma, \tilde{x}, A, v, r_{socp})$ write just $f_{obj}$. A similar convention will be applied to a few other functions throughout the paper. On many occasions, though (especially where we deem it substantial to the derivation), we will also keep all (or a majority of) arguments of the corresponding functions.

2.1 Upper-bounding $f_{obj}$

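In this section we present a general framework for finding a "high-probability" upper bound on $f_{obj}$. We start by noting that if one knows that $y = A\tilde{x} + v$ holds then (10) can be rewritten as

$$\min_{x} \ \|x\|_1 - \|\tilde{x}\|_1 \quad \text{subject to} \quad \|v + A\tilde{x} - Ax\|_2 \le r_{socp}. \tag{11}$$

After a small change of variables, $x = \tilde{x} + w$, (11) becomes

$$\min_{w} \ \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \quad \text{subject to} \quad \|v - Aw\|_2 \le r_{socp}, \tag{12}$$

or in a more compact form

$$\min_{w} \ \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \quad \text{subject to} \quad \left\| A_v \begin{bmatrix} w \\ \sigma \end{bmatrix} \right\|_2 \le r_{socp}, \tag{13}$$

where $A_v = [-A \ \ v]$ is now an $m \times (n+1)$ random matrix with i.i.d. standard normal components (with a slight abuse of notation, $v$ inside $A_v$ stands for the noise vector normalized by $\sigma$, which is why $\sigma$ appears explicitly in the vector that multiplies $A_v$). Now, let $C_{w_{up}}$ be a positive scalar. Then the optimal value of the objective of the following optimization problem is an upper bound on $f_{obj}$

$$\min_{w} \ \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \quad \text{subject to} \quad \left\| A_v \begin{bmatrix} w \\ \sigma \end{bmatrix} \right\|_2 \le r_{socp}, \quad \|w\|_2^2 \le C_{w_{up}}^2. \tag{14}$$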
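The passage from the constraint $\|v - Aw\|_2 \le r_{socp}$ to its compact form with $A_v = [-A \ \ v]$ can be checked numerically. The sketch below assumes that the noise is written as $\sigma$ times a unit-variance vector, here called `v_std` (a name introduced only for this example), so that the last column of $A_v$ has standard normal entries.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, sigma = 6, 10, 0.7               # illustrative sizes only
A = rng.standard_normal((m, n))
v_std = rng.standard_normal(m)         # unit-variance noise direction
v = sigma * v_std                      # noise with standard deviation sigma
w = rng.standard_normal(n)             # an arbitrary candidate error vector

# m x (n + 1) matrix [-A  v_std]; all of its entries are standard normal.
A_v = np.hstack([-A, v_std[:, None]])

lhs = np.linalg.norm(v - A @ w)                    # constraint norm in (12)
rhs = np.linalg.norm(A_v @ np.append(w, sigma))    # constraint norm in (13)
```

Both quantities agree because $A_v [w^T \ \sigma]^T = -Aw + \sigma v_{std} = v - Aw$, so the two constraints describe the same feasible set.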

One can then proceed by solving the above optimization problem through Lagrange duality. However, instead of doing that we recognize that (14) is the same as the first equation in Section 3.2 in [62]. One can then repeat all the steps from Section 3.2 in [62] until the last equation before Lemma 6 to obtain

$$-f_{obj}^{(up)} = -\min_{\lambda^{(2)}, \nu^{(1)}} \ \max_{\|a\|_2 = C_{w_{up}}} \ \left( \left( (z^{(1)} - 2\lambda^{(2)})^T - \nu^{(1)T} A \right) a - \nu^{(1)T} v \sigma + \|\nu^{(1)}\|_2 r_{socp} + 2 \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i \right)$$
$$\text{subject to} \quad 0 \le \lambda_i^{(2)} \le 1, \ 1 \le i \le n, \tag{15}$$

where $z^{(1)}$ is an $n$ dimensional vector of all ones, $\lambda^{(2)}$ and $\nu^{(1)}$ are $n$ and $m$ dimensional vectors of Lagrange variables, respectively, and $-f_{obj}^{(up)}$ is the optimal value of (14). If we can establish a "high-probability" lower bound on $f_{obj}^{(up)}$ we will have a "high-probability" upper bound on the objective value of (14). To do so, we recall Lemma 6 from [62] (Lemma 6 from [62] is a slightly modified Lemma 3.1 from [37], which is the backbone of the escape through a mesh theorem utilized in [65]).
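The detailed steps leading to (15) are in [62] and are not reproduced here. As a hedged illustration of how box-constrained multipliers of the type $0 \le \lambda_i^{(2)} \le 1$ can arise, the sketch below checks the elementary identity $\|u\|_1 = \max_{0 \le \lambda_i \le 1} (z - 2\lambda)^T u$, with $z$ the all-ones vector. That this identity is the mechanism used in [62] is an assumption of this example; the function name `l1_via_box_multipliers` is hypothetical.

```python
import numpy as np


def l1_via_box_multipliers(u):
    """Evaluate max over lam in [0, 1]^n of (z - 2*lam)^T u, with z all ones.

    The maximizer picks lam_i = 1 where u_i < 0 and lam_i = 0 otherwise,
    which turns each term into |u_i|.
    """
    z = np.ones_like(u)
    lam = (u < 0).astype(float)
    return (z - 2.0 * lam) @ u, lam


rng = np.random.default_rng(4)
u = rng.standard_normal(8)
value, lam_star = l1_via_box_multipliers(u)

# Any other feasible lam gives a value that is no larger than ||u||_1.
lam_other = rng.uniform(0.0, 1.0, size=u.size)
value_other = (np.ones_like(u) - 2.0 * lam_other) @ u
```

Each coefficient $1 - 2\lambda_i$ ranges over $[-1, 1]$, so the maximum of $(1 - 2\lambda_i) u_i$ is $|u_i|$, which gives the identity term by term.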

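Lemma 1. Let $A$ be an $m \times n$ matrix with i.i.d. standard normal components. Let $\mathbf{g}$ and $\mathbf{h}$ be $m \times 1$ and $(n+1) \times 1$ vectors, respectively, with i.i.d. standard normal components. Also, let $g$ be a standard normal random variable and let $\Lambda$ be a set such that $\Lambda = \{\lambda^{(2)} \,|\, 0 \le \lambda_i^{(2)} \le 1, \ 1 \le i \le n\}$. Then

$$P\left( \min_{\lambda^{(2)} \in \Lambda, \, \nu^{(1)} \in R^m \setminus 0} \ \max_{\|a\|_2 = C_{w_{up}}} \left( -\nu^{(1)T} [A \ \ v] \begin{bmatrix} a \\ \sigma \end{bmatrix} + \|\nu^{(1)}\|_2 \, g - \psi_{a, \lambda^{(2)}, \nu^{(1)}} \right) \ge 0 \right)$$
$$\ge P\left( \min_{\lambda^{(2)} \in \Lambda, \, \nu^{(1)} \in R^m \setminus 0} \ \max_{\|a\|_2 = C_{w_{up}}} \left( \|\nu^{(1)}\|_2 \left( \sum_{i=1}^{n} h_i a_i + h_{n+1} \sigma \right) + \sqrt{C_{w_{up}}^2 + \sigma^2} \sum_{i=1}^{m} g_i \nu_i^{(1)} - \psi_{a, \lambda^{(2)}, \nu^{(1)}} \right) \ge 0 \right). \tag{16}$$

Let

$$\psi_{a, \lambda^{(2)}, \nu^{(1)}} = \epsilon_5^{(g)} \sqrt{n} \, \|\nu^{(1)}\|_2 - a^T (z^{(1)} - 2\lambda^{(2)}) - \|\nu^{(1)}\|_2 r_{socp} - 2 \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i + \zeta_{obj}^{(up)}, \tag{17}$$

with $\epsilon_5^{(g)} > 0$ being an arbitrarily small constant independent of $n$ and $\zeta_{obj}^{(up)}$ being a constant to be specified later. The right-hand side of the inequality in (16) is then the following probability of interest

$$P_u = P\Bigg( \min_{\lambda^{(2)} \in \Lambda, \, \nu^{(1)} \in R^m \setminus 0} \ \max_{\|a\|_2 = C_{w_{up}}} \Bigg( \|\nu^{(1)}\|_2 \left( \sum_{i=1}^{n} h_i a_i + h_{n+1} \sigma \right) + \sqrt{C_{w_{up}}^2 + \sigma^2} \sum_{i=1}^{m} g_i \nu_i^{(1)}$$
$$- \epsilon_5^{(g)} \sqrt{n} \, \|\nu^{(1)}\|_2 + a^T (z^{(1)} - 2\lambda^{(2)}) + \|\nu^{(1)}\|_2 r_{socp} + 2 \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i \Bigg) \ge \zeta_{obj}^{(up)} \Bigg).$$

After solving the inner maximization over $a$ one has

$$P_u = P\Bigg( \min_{\lambda^{(2)} \in \Lambda, \, \nu^{(1)} \in R^m \setminus 0} \Bigg( C_{w_{up}} \left\| \|\nu^{(1)}\|_2 \mathbf{h} + (z^{(1)} - 2\lambda^{(2)}) \right\|_2 + (h_{n+1} \sigma - \epsilon_5^{(g)} \sqrt{n}) \|\nu^{(1)}\|_2$$
$$+ \sqrt{C_{w_{up}}^2 + \sigma^2} \sum_{i=1}^{m} g_i \nu_i^{(1)} + r_{socp} \|\nu^{(1)}\|_2 + 2 \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i \Bigg) \ge \zeta_{obj}^{(up)} \Bigg).$$

After minimization of the third term over the vectors $\nu^{(1)}$ of norm $\|\nu^{(1)}\|_2$ we further have

$$P_u = P\Bigg( \min_{\lambda^{(2)} \in \Lambda, \, \nu^{(1)} \in R^m \setminus 0} \Bigg( C_{w_{up}} \left\| \|\nu^{(1)}\|_2 \mathbf{h} + (z^{(1)} - 2\lambda^{(2)}) \right\|_2 + (h_{n+1} \sigma - \epsilon_5^{(g)} \sqrt{n}) \|\nu^{(1)}\|_2$$
$$- \sqrt{C_{w_{up}}^2 + \sigma^2} \, \|\mathbf{g}\|_2 \|\nu^{(1)}\|_2 + r_{socp} \|\nu^{(1)}\|_2 + 2 \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i \Bigg) \ge \zeta_{obj}^{(up)} \Bigg). \tag{18}$$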
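The two closed-form steps used above, maximizing a linear function over a sphere and minimizing a linear function over vectors of fixed norm, both rest on the Cauchy-Schwarz inequality. A minimal numerical check follows; the dimension, the radius `C`, and the variable names are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n, C = 7, 2.5                              # illustrative dimension and radius
u = rng.standard_normal(n)

# Maximizer of a^T u over the sphere ||a||_2 = C is a = C * u / ||u||_2.
a_star = C * u / np.linalg.norm(u)
best = a_star @ u                          # equals  C * ||u||_2
lowest = (-a_star) @ u                     # equals -C * ||u||_2 (the minimum)

# Random points on the same sphere never exceed the closed-form maximum.
a_rand = rng.standard_normal((1000, n))
a_rand *= C / np.linalg.norm(a_rand, axis=1, keepdims=True)
worst_gap = best - np.max(a_rand @ u)
```

In the derivation, the first identity removes the maximization over $a$ and the second one replaces a sum of the form $\sum_i g_i \nu_i^{(1)}$ by $-\|\mathbf{g}\|_2 \|\nu^{(1)}\|_2$.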

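Now we change variables so that $\nu = \|\nu^{(1)}\|_2$ and assume that there is an arbitrarily large constant $C_\nu$ such that $\nu < C_\nu$, where $\nu$ is the solution of the optimization inside the probability (using this assumption here will not substantially affect the value of the above probability if it eventually turns out that this assumption is valid with overwhelming probability; of course, this will turn out to be the case in all scenarios of interest in our analysis; strictly speaking, from this point on all our overwhelming probabilities should be multiplied by a probability that $\nu < C_\nu$; to make writing less tedious we omit this probability and use strict inequalities). Returning back to (18) gives us

$$P_u \ge P\Bigg( \min_{\lambda^{(2)} \in \Lambda, \, \nu \in (0, C_\nu)} \Bigg( C_{w_{up}} \left\| \nu \mathbf{h} + (z^{(1)} - 2\lambda^{(2)}) \right\|_2 + (h_{n+1} \sigma - \epsilon_5^{(g)} \sqrt{n}) \nu$$
$$- \sqrt{C_{w_{up}}^2 + \sigma^2} \, \|\mathbf{g}\|_2 \nu + r_{socp} \nu + 2 \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i \Bigg) \ge \zeta_{obj}^{(up)} \Bigg). \tag{19}$$

Since $h_{n+1}$ is a standard normal one has $P(h_{n+1} \sigma \ge -\epsilon_1^{(h)} \sqrt{n}) \ge 1 - e^{-\epsilon_2^{(h)} n}$, where $\epsilon_1^{(h)} > 0$ is an arbitrarily small constant and $\epsilon_2^{(h)}$ is a constant dependent on $\epsilon_1^{(h)}$ and $\sigma$ but independent of $n$. Then from (19) we obtain

$$P_u \ge P\Bigg( \min_{\lambda^{(2)} \in \Lambda, \, \nu \in (0, C_\nu)} \Bigg( C_{w_{up}} \left\| \nu \mathbf{h} + (z^{(1)} - 2\lambda^{(2)}) \right\|_2 - (\epsilon_5^{(g)} + \epsilon_1^{(h)}) \sqrt{n} \, \nu$$
$$- \sqrt{C_{w_{up}}^2 + \sigma^2} \, \|\mathbf{g}\|_2 \nu + r_{socp} \nu + 2 \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i \Bigg) \ge \zeta_{obj}^{(up)} \Bigg) \left( 1 - e^{-\epsilon_2^{(h)} n} \right). \tag{20}$$

Set $\Lambda^{(2)} = \{\lambda^{(2)} \,|\, 0 \le \lambda_i^{(2)} \le 2, \ 1 \le i \le n\}$ and

$$\xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}) = \min_{\lambda^{(2)} \in \Lambda^{(2)}, \, \nu \in (0, C_\nu)} \Bigg( C_{w_{up}} \left\| \nu \mathbf{h} + z^{(1)} - \lambda^{(2)} \right\|_2 - \sqrt{C_{w_{up}}^2 + \sigma^2} \, \|\mathbf{g}\|_2 \nu$$
$$+ \left( r_{socp} - (\epsilon_5^{(g)} + \epsilon_1^{(h)}) \sqrt{n} \right) \nu + \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i \Bigg). \tag{21}$$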

Now, before proceeding further we first recall the following incredible result from [20] related to the concentration of Lipschitz functions of Gaussian random variables.

Lemma 2 ([20, 55]). Let $f_{lip}(\cdot) : R^n \to R$ be a Lipschitz function such that $|f_{lip}(a) - f_{lip}(b)| \le c_{lip} \|a - b\|_2$. Let $a$ be a vector comprised of i.i.d. zero-mean, unit variance Gaussian random variables and let $\epsilon_{lip} > 0$. Then

$$P\left( |f_{lip}(a) - E f_{lip}(a)| \ge \epsilon_{lip} |E f_{lip}(a)| \right) \le \exp\left\{ -\frac{(\epsilon_{lip} E f_{lip}(a))^2}{2 c_{lip}^2} \right\}. \tag{22}$$
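A quick Monte Carlo experiment can make this type of concentration tangible. The sketch below uses $f(a) = \|a\|_2$, which is Lipschitz with constant 1, and compares the empirical deviation frequency with a bound of the form $\exp\{-(\epsilon E f(a))^2 / (2 c_{lip}^2)\}$. The exact form of the bound is assumed here to be the usual Gaussian one, the expectation is replaced by a sample mean, and all numerical values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)
n, trials, eps = 200, 5000, 0.1           # illustrative values only

a = rng.standard_normal((trials, n))      # each row is one Gaussian vector
f = np.linalg.norm(a, axis=1)             # f(a) = ||a||_2, Lipschitz with c_lip = 1
mean_f = f.mean()                         # Monte Carlo stand-in for E f(a)

# Empirical frequency of relative deviations of size at least eps.
empirical = np.mean(np.abs(f - mean_f) >= eps * mean_f)

# Assumed Gaussian concentration bound with c_lip = 1.
bound = np.exp(-(eps * mean_f) ** 2 / 2.0)
```

Because $E f(a)$ grows like $\sqrt{n}$ while $c_{lip}$ stays fixed, the bound decays exponentially in $n$ for any fixed `eps`, which is what makes statements of this type "overwhelming probability" statements.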



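In the following lemma we will show that $\xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}})$ is a Lipschitz function. As such it will then concentrate according to the above lemma.

Lemma 3. Let $\mathbf{g}$ and $\mathbf{h}$ be $m$ and $n$ dimensional vectors, respectively, with i.i.d. standard normal variables as their components. Let $\sigma > 0$ be an arbitrary scalar. Let $\xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}})$ be as in (21). Further let $\epsilon_{lip} > 0$ be any constant. Then

$$P\left( |\xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}) - E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}})| \ge \epsilon_{lip} |E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}})| \right)$$
$$\le \exp\left\{ -\frac{(\epsilon_{lip} E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}))^2}{2 C_\nu^2 (2 C_{w_{up}}^2 + \sigma^2)} \right\}. \tag{23}$$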

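Proof. The proof will parallel the corresponding one from [62]. We start by setting

$$f_{lip}(\mathbf{g}^{(1)}, \mathbf{h}^{(1)}) = \xi_{up}(\sigma, \mathbf{g}^{(1)}, \mathbf{h}^{(1)}, \tilde{x}, r_{socp}, C_{w_{up}}). \tag{24}$$

Further, let $\nu^{(up1)}$ and $\lambda^{(up1)}$ be the solutions of the minimization on the right-hand side of (24). In an analogous fashion set

$$f_{lip}(\mathbf{g}^{(2)}, \mathbf{h}^{(2)}) = \xi_{up}(\sigma, \mathbf{g}^{(2)}, \mathbf{h}^{(2)}, \tilde{x}, r_{socp}, C_{w_{up}}), \tag{25}$$

and let $\nu^{(up2)}$ and $\lambda^{(up2)}$ be the solutions of the minimization on the right-hand side of (25). Now assume that $f_{lip}(\mathbf{g}^{(1)}, \mathbf{h}^{(1)}) \ne f_{lip}(\mathbf{g}^{(2)}, \mathbf{h}^{(2)})$ (if they are equal we are trivially done). Further let $f_{lip}(\mathbf{g}^{(1)}, \mathbf{h}^{(1)}) < f_{lip}(\mathbf{g}^{(2)}, \mathbf{h}^{(2)})$ (the rest of the argument can of course trivially be flipped if $f_{lip}(\mathbf{g}^{(1)}, \mathbf{h}^{(1)}) > f_{lip}(\mathbf{g}^{(2)}, \mathbf{h}^{(2)})$). We then have

$$|f_{lip}(\mathbf{g}^{(2)}, \mathbf{h}^{(2)}) - f_{lip}(\mathbf{g}^{(1)}, \mathbf{h}^{(1)})| = f_{lip}(\mathbf{g}^{(2)}, \mathbf{h}^{(2)}) - f_{lip}(\mathbf{g}^{(1)}, \mathbf{h}^{(1)})$$
$$= \left( r_{socp} - (\epsilon_5^{(g)} + \epsilon_1^{(h)}) \sqrt{n} \right) \nu^{(up2)} - \sqrt{C_{w_{up}}^2 + \sigma^2} \, \|\mathbf{g}^{(2)}\|_2 \nu^{(up2)} + C_{w_{up}} \|\nu^{(up2)} \mathbf{h}^{(2)} + z^{(1)} - \lambda^{(up2)}\|_2 + \sum_{i=n-k+1}^{n} \lambda_i^{(up2)} \tilde{x}_i$$
$$- \left( \left( r_{socp} - (\epsilon_5^{(g)} + \epsilon_1^{(h)}) \sqrt{n} \right) \nu^{(up1)} - \sqrt{C_{w_{up}}^2 + \sigma^2} \, \|\mathbf{g}^{(1)}\|_2 \nu^{(up1)} + C_{w_{up}} \|\nu^{(up1)} \mathbf{h}^{(1)} + z^{(1)} - \lambda^{(up1)}\|_2 + \sum_{i=n-k+1}^{n} \lambda_i^{(up1)} \tilde{x}_i \right)$$
$$\le -\sqrt{C_{w_{up}}^2 + \sigma^2} \, (\|\mathbf{g}^{(2)}\|_2 - \|\mathbf{g}^{(1)}\|_2) \nu^{(up1)} + C_{w_{up}} \left( \|\nu^{(up1)} \mathbf{h}^{(2)} + z^{(1)} - \lambda^{(up1)}\|_2 - \|\nu^{(up1)} \mathbf{h}^{(1)} + z^{(1)} - \lambda^{(up1)}\|_2 \right)$$
$$\le C_\nu \left( \sqrt{C_{w_{up}}^2 + \sigma^2} \, \|\mathbf{g}^{(2)} - \mathbf{g}^{(1)}\|_2 + C_{w_{up}} \|\mathbf{h}^{(2)} - \mathbf{h}^{(1)}\|_2 \right)$$
$$\le C_\nu \sqrt{2 C_{w_{up}}^2 + \sigma^2} \, \sqrt{\|\mathbf{g}^{(2)} - \mathbf{g}^{(1)}\|_2^2 + \|\mathbf{h}^{(2)} - \mathbf{h}^{(1)}\|_2^2}, \tag{26}$$

where the first inequality follows by sub-optimality of $\nu^{(up1)}$ and $\lambda^{(up1)}$ in (25). Connecting the beginning and the end in (26) and combining it with (24) one then has that $\xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}})$ is Lipschitz with $c_{lip} = C_\nu \sqrt{2 C_{w_{up}}^2 + \sigma^2}$. (23) then easily follows by Lemma 2. $\square$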

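Let $\nu_{up}$ and $\lambda_{up}$ be the solutions of the optimization in (21). One then has that $\|\nu_{up} \mathbf{h} + z^{(1)} - \lambda_{up}\|_2$ and $\nu_{up}$ concentrate as well. More formally, one then has analogues to (23)

$$P\left( \left| \|\nu_{up} \mathbf{h} + z^{(1)} - \lambda_{up}\|_2 - E \|\nu_{up} \mathbf{h} + z^{(1)} - \lambda_{up}\|_2 \right| \ge \epsilon_1^{(normup)} E \|\nu_{up} \mathbf{h} + z^{(1)} - \lambda_{up}\|_2 \right) \le e^{-\epsilon_2^{(normup)} n}$$
$$P\left( |\nu_{up} - E \nu_{up}| \ge \epsilon_1^{(\nu up)} E \nu_{up} \right) \le e^{-\epsilon_2^{(\nu up)} n}, \tag{27}$$

where as usual $\epsilon_1^{(normup)} > 0$ and $\epsilon_1^{(\nu up)} > 0$ are arbitrarily small constants and $\epsilon_2^{(normup)}$ and $\epsilon_2^{(\nu up)}$ are constants dependent on $\epsilon_1^{(normup)} > 0$ and $\epsilon_1^{(\nu up)} > 0$, respectively, but independent of $n$.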



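Set

$$\zeta_{obj}^{(up)} = E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}) - \epsilon_{lip} |E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}})|, \tag{28}$$

where $\epsilon_{lip} > 0$ is an arbitrarily small constant. From (20) one then has

$$P_u \ge P\left( \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}) \ge \zeta_{obj}^{(up)} \right) \left( 1 - e^{-\epsilon_2^{(h)} n} \right) \ge \left( 1 - \exp\left\{ -\frac{(\epsilon_{lip} E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}))^2}{2 C_\nu^2 (2 C_{w_{up}}^2 + \sigma^2)} \right\} \right) \left( 1 - e^{-\epsilon_2^{(h)} n} \right). \tag{29}$$

(29) is conceptually enough to establish a "high probability" upper bound on $f_{obj}$. What is left is to connect it with (15). Combining (29), (16), and (15) we then obtain

$$P\left( f_{obj}^{(up)} \ge \zeta_{obj}^{(up)} \right) \ge \left( 1 - \exp\left\{ -\frac{(\epsilon_{lip} E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}))^2}{2 C_\nu^2 (2 C_{w_{up}}^2 + \sigma^2)} \right\} \right) \left( 1 - e^{-\epsilon_2^{(h)} n} \right) \left( 1 - e^{-\epsilon_6^{(g)} n} \right), \tag{30}$$

where we used the fact that $g$ is a standard normal and therefore $P(g - \epsilon_5^{(g)} \sqrt{n} \le 0) \ge 1 - e^{-\epsilon_6^{(g)} n}$ for an arbitrarily small $\epsilon_5^{(g)} > 0$ and a constant $\epsilon_6^{(g)}$ dependent on $\epsilon_5^{(g)}$ but independent of $n$. Let $\epsilon_{upper}$ be a constant such that

$$1 - e^{-\epsilon_{upper} n} \le \left( 1 - \exp\left\{ -\frac{(\epsilon_{lip} E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}))^2}{2 C_\nu^2 (2 C_{w_{up}}^2 + \sigma^2)} \right\} \right) \left( 1 - e^{-\epsilon_2^{(h)} n} \right) \left( 1 - e^{-\epsilon_6^{(g)} n} \right). \tag{31}$$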

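We now summarize results from this subsection in the following lemma.

Lemma 4. Let $v$ be an $m \times 1$ vector of i.i.d. zero-mean variance $\sigma^2$ Gaussian random variables and let $A$ be an $m \times n$ matrix of i.i.d. standard normal random variables. Consider an $\tilde{x}$ defined in (6) and a $y$ defined in (3) for $x = \tilde{x}$. Let then $f_{obj}$ be as defined in (10) and let $w$ be the solution of (14). There is a constant $\epsilon_{upper} > 0$ defined in (31) such that

$$P\left( f_{obj} \le f_{obj}^{(upper)} \right) \ge 1 - e^{-\epsilon_{upper} n}, \tag{32}$$

where

$$f_{obj}^{(upper)} = -E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}}) + \epsilon_{lip} |E \xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}})| + \epsilon_5^{(g)} \sqrt{n} + \epsilon_1^{(h)} \sqrt{n}, \tag{33}$$

$\xi_{up}(\sigma, \mathbf{g}, \mathbf{h}, \tilde{x}, r_{socp}, C_{w_{up}})$ is as defined in (21), $\epsilon_{lip}$, $\epsilon_1^{(h)}$, $\epsilon_5^{(g)}$ are all positive arbitrarily small constants, and $C_{w_{up}}$ is a constant such that $\|w\|_2 \le C_{w_{up}}$.

Proof. Follows from the discussion above. $\square$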

2.2 Lower-bounding $f_{obj}$

In this section we present the part of the framework that relates to finding a "high-probability" lower bound on $f_{obj}$. To make the arguments that will follow less tedious we will already here make an assumption that is significantly weaker than what we will eventually prove. Namely, we will assume that there is a (if necessary arbitrarily large) constant $C_w$ such that

$$P(\|w\|_2 \le C_w) \ge 1 - e^{-\epsilon_{C_w} n}, \tag{34}$$

for an arbitrarily large constant $C_w$ and a constant $\epsilon_{C_w} > 0$ dependent on $C_w$ but independent of $n$. The flow of our presentation would probably be more natural if one provided a direct proof of this statement right here. However, given the difficulty of the task ahead we refrain from doing that and assume that the statement is correct. Roughly speaking, what we actually assume is that $\|w_{socp}\|_2$ is bounded by an arbitrarily large constant (of course, as mentioned above, we hope to create a machinery that can prove much "bigger" things than (34)).

Now we will look at the following optimization problem

$$\min_{x} \ \|y - Ax\|_2 \quad \text{subject to} \quad \|x\|_1 - \|\tilde{x}\|_1 \le f_{obj}^{(lower)}. \tag{35}$$

If we can show that for a certain $f_{obj}^{(lower)}$ the objective of (35) is with overwhelming probability larger than $r_{socp}$, then $f_{obj}^{(lower)}$ will be a "high-probability" lower bound on the optimal value of the objective of (10), i.e. on $f_{obj}$. Hence, the strategy will be to show that for a certain $f_{obj}^{(lower)}$ the optimal value of the objective in (35) is with overwhelming probability lower bounded by a quantity larger than $r_{socp}$. We again start by noting that if one knows that $y = A\tilde{x} + v$ holds then (35) can be rewritten as

$$\min_{x} \ \|v + A\tilde{x} - Ax\|_2 \quad \text{subject to} \quad \|x\|_1 - \|\tilde{x}\|_1 \le f_{obj}^{(lower)}. \tag{36}$$

After a small change of variables, $x = \tilde{x} + w$, (36) becomes

$$\min_{w} \ \|v - Aw\|_2 \quad \text{subject to} \quad \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \le f_{obj}^{(lower)}, \tag{37}$$

or in a more compact form

$$\min_{w} \ \left\| A_v \begin{bmatrix} w \\ \sigma \end{bmatrix} \right\|_2 \quad \text{subject to} \quad \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \le f_{obj}^{(lower)}, \tag{38}$$

where as in the previous section $A_v = [-A \ \ v]$ is now an $m \times (n+1)$ random matrix with i.i.d. standard normal components. Set

$$\zeta_{obj} = \min_{w} \ \left\| A_v \begin{bmatrix} w \\ \sigma \end{bmatrix} \right\|_2 \quad \text{subject to} \quad \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \le f_{obj}^{(lower)}. \tag{39}$$

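Let

$$S_w(\sigma, \tilde{x}, C_w, f_{obj}^{(lower)}) = \left\{ [w^T \ \sigma]^T \in R^{n+1} \ \Big| \ \|w\|_2 \le C_w \ \text{and} \ \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \le f_{obj}^{(lower)} \right\}. \tag{40}$$

Set

$$\zeta_{obj}^{(help)} = \min_{[w^T \ \sigma]^T \in S_w(\sigma, \tilde{x}, C_w, f_{obj}^{(lower)})} \left\| A_v \begin{bmatrix} w \\ \sigma \end{bmatrix} \right\|_2 = \min_{[w^T \ \sigma]^T \in S_w(\sigma, \tilde{x}, C_w, f_{obj}^{(lower)})} \ \max_{\|a\|_2 = 1} \ a^T A_v \begin{bmatrix} w \\ \sigma \end{bmatrix}. \tag{41}$$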



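Now, after applying Lemma 3.1 from [37] one has

$$
\begin{aligned}
&P\left(\min_{[\mathbf{w}^T\ \sigma]^T\in S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})}\ \max_{\|\mathbf{a}\|_2=1}\left(\mathbf{a}^TA_v\begin{bmatrix}\mathbf{w}\\ \sigma\end{bmatrix}+\sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\,g\right)\ge\zeta_{obj}^{(l)}\right)\\
&\quad\ge P\left(\min_{[\mathbf{w}^T\ \sigma]^T\in S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})}\ \max_{\|\mathbf{a}\|_2=1}\left(\sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\sum_{i=1}^{m}g_ia_i+\sum_{i=1}^{n}h_iw_i+h_{n+1}\sigma\right)\ge\zeta_{obj}^{(l)}\right).
\end{aligned} \tag{42}
$$

In what follows we will analyze the following probability

$$
p_l=P\left(\min_{[\mathbf{w}^T\ \sigma]^T\in S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})}\ \max_{\|\mathbf{a}\|_2=1}\left(\sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\sum_{i=1}^{m}g_ia_i+\sum_{i=1}^{n}h_iw_i+h_{n+1}\sigma\right)\ge\zeta_{obj}^{(l)}\right) \tag{43}
$$

which is of course nothing but the probability on the right-hand side of the inequality in (42). We will
essentially show that for a certain $\zeta_{obj}^{(l)}$ this probability is close to 1. That will rather obviously imply that we
have a "high probability" lower bound on $\zeta_{obj}$. Moreover, if such a lower bound is larger than $r_{socp}$ we will
be done in terms of establishing a "high probability" lower bound on $f_{obj}$. To that end, we first note that the
maximization over $\mathbf{a}$ is trivial and one obtains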



$$
p_l=P\left(\min_{[\mathbf{w}^T\ \sigma]^T\in S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})}\left(\sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\,\|\mathbf{g}\|_2+\sum_{i=1}^{n}h_iw_i\right)+h_{n+1}\sigma\ge\zeta_{obj}^{(l)}\right). \tag{44}
$$

To facilitate the exposition that will follow let

$$
\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})=\min_{[\mathbf{w}^T\ \sigma]^T\in S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})}\left(\sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\,\|\mathbf{g}\|_2+\sum_{i=1}^{n}h_iw_i\right). \tag{45}
$$

Since $C_w$ is not a substantially important parameter in our derivation we omit it from the list of arguments
of $\xi$; this is a practice that we will adopt on many occasions below, often without explicitly mentioning it.
Also, one should note here that, although present in the definition of $S_w$, $\sigma$ clearly does not have an impact
through $S_w$ on the result of the above optimization. Now we split the analysis into two parts. The first one
will be a deterministic analysis of $\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})$ and will be presented in Subsection 2.2.1. In the
second part (that will be presented in Subsection 2.2.2) we will use the results of that analysis and continue
the above probabilistic arguments applying various concentration results.

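2.2.1 Optimizing $\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})$

In this section we compute $\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})$. We first rewrite the optimization problem from (45) in the
following form

$$
\begin{array}{rl}
\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})=\min_{\mathbf{w}} & \sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\,\|\mathbf{g}\|_2+\sum_{i=1}^{n}h_iw_i \\
\text{subject to} & \|\tilde{\mathbf{x}}+\mathbf{w}\|_1-\|\tilde{\mathbf{x}}\|_1\le f_{obj}^{(lower)} \\
& \sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\le\sqrt{C_w^2+\sigma^2}.
\end{array} \tag{46}
$$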



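From this point one can proceed with solving the above problem through Lagrangian duality. However,
instead one can recognize that the above optimization problem is fairly similar to (23) in [62]. The difference
is only in the constant term in the first constraint. After carefully repeating all the steps between (23) and
(39) in [62] one then arrives at the following analogue to (39) from [62]

$$
\begin{array}{rl}
\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})=\max_{\nu,\lambda^{(2)},\gamma} & \sigma\sqrt{(\|\mathbf{g}\|_2+\gamma)^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\gamma\sqrt{C_w^2+\sigma^2}-\nu f_{obj}^{(lower)} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2\nu,\ 1\le i\le n \\
& \|\mathbf{g}\|_2+\gamma-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2\ge0 \\
& \gamma\ge0.
\end{array} \tag{47}
$$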

Now, the maximization over $\gamma$ can be done. After setting the derivative to zero one finds

$$
\sigma\frac{\|\mathbf{g}\|_2+\gamma}{\sqrt{(\|\mathbf{g}\|_2+\gamma)^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}}-\sqrt{C_w^2+\sigma^2}=0 \tag{48}
$$

and after some algebra

$$
\gamma_{opt}=\sqrt{1+\frac{\sigma^2}{C_w^2}}\,\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2-\|\mathbf{g}\|_2, \tag{49}
$$

where of course $\gamma_{opt}$ would be the solution of (47) only if larger than or equal to zero. Alternatively of
course $\gamma_{opt}=0$. Now, based on these two scenarios we distinguish two different optimization problems:

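1. The "overwhelming" optimization

$$
\begin{array}{rl}
\xi_{ov}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})=\max_{\nu,\lambda^{(2)}} & \sigma\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu f_{obj}^{(lower)} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2\nu,\ 1\le i\le n.
\end{array} \tag{50}
$$

2. The "non-overwhelming" optimization

$$
\begin{array}{rl}
\xi_{nov}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})=\max_{\nu,\lambda^{(2)}} & \sqrt{C_w^2+\sigma^2}\,\|\mathbf{g}\|_2-C_w\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu f_{obj}^{(lower)} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2\nu,\ 1\le i\le n.
\end{array} \tag{51}
$$

The "overwhelming" optimization is the equivalent to (47) if for its optimal values $\hat{\nu}$ and $\hat{\lambda}^{(2)}$ one has

$$
\sqrt{1+\frac{\sigma^2}{C_w^2}}\,\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2\le\|\mathbf{g}\|_2. \tag{52}
$$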
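The algebra between (48) and (49), and the way $\gamma_{opt}$ turns the objective of (47) into the objective of the "non-overwhelming" problem, can be sketched as follows. The shorthand $u$ and $B$ is introduced here only for illustration and is not notation from the original derivation.

$$
u=\|\mathbf{g}\|_2+\gamma,\qquad B=\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2 .
$$

$$
\frac{\sigma u}{\sqrt{u^2-B^2}}=\sqrt{C_w^2+\sigma^2}
\ \Longrightarrow\ \sigma^2u^2=(C_w^2+\sigma^2)(u^2-B^2)
\ \Longrightarrow\ u=\sqrt{1+\frac{\sigma^2}{C_w^2}}\,B ,
$$

which is (49). Substituting this $u$ back into the $\gamma$-dependent part of the objective of (47) gives

$$
\sigma\sqrt{u^2-B^2}-\gamma_{opt}\sqrt{C_w^2+\sigma^2}
=\frac{\sigma^2B}{C_w}-\frac{(C_w^2+\sigma^2)B}{C_w}+\sqrt{C_w^2+\sigma^2}\,\|\mathbf{g}\|_2
=\sqrt{C_w^2+\sigma^2}\,\|\mathbf{g}\|_2-C_wB ,
$$

which matches the first two terms of (51). Setting $\gamma=0$ instead leaves $\sigma\sqrt{\|\mathbf{g}\|_2^2-B^2}$, the first term of (50).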



We now summarize in the following lemma the results of this subsection. 

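Lemma 5. Let $\hat{\nu}$ and $\hat{\lambda}^{(2)}$ be the solutions of (50) and analogously let $\tilde{\nu}$ and $\tilde{\lambda}^{(2)}$ be the solutions of (51).
Let $\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})$ be, as defined in (45), the optimal value of the objective function in (45). Then

$$
\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})=
\begin{cases}
\sigma\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2^2}-\sum_{i=n-k+1}^{n}\hat{\lambda}_i^{(2)}\tilde{x}_i-\hat{\nu}f_{obj}^{(lower)}, & \text{if } \sqrt{1+\frac{\sigma^2}{C_w^2}}\,\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2\le\|\mathbf{g}\|_2 \\
\sqrt{C_w^2+\sigma^2}\,\|\mathbf{g}\|_2-C_w\|\mathbf{h}+\tilde{\nu}\mathbf{z}^{(1)}-\tilde{\lambda}^{(2)}\|_2-\sum_{i=n-k+1}^{n}\tilde{\lambda}_i^{(2)}\tilde{x}_i-\tilde{\nu}f_{obj}^{(lower)}, & \text{otherwise.}
\end{cases} \tag{53}
$$

Moreover, let $\hat{\mathbf{w}}$ be the solution of (45). Then

$$
\hat{\mathbf{w}}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})=
\begin{cases}
\dfrac{\sigma(\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)})}{\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2^2}}, & \text{if } \sqrt{1+\frac{\sigma^2}{C_w^2}}\,\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2\le\|\mathbf{g}\|_2 \\
\dfrac{C_w(\mathbf{h}+\tilde{\nu}\mathbf{z}^{(1)}-\tilde{\lambda}^{(2)})}{\|\mathbf{h}+\tilde{\nu}\mathbf{z}^{(1)}-\tilde{\lambda}^{(2)}\|_2}, & \text{otherwise}
\end{cases} \tag{54}
$$

and

$$
\|\hat{\mathbf{w}}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})\|_2=
\begin{cases}
\dfrac{\sigma\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2}{\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2^2}}, & \text{if } \sqrt{1+\frac{\sigma^2}{C_w^2}}\,\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2\le\|\mathbf{g}\|_2 \\
C_w, & \text{otherwise.}
\end{cases} \tag{55}
$$

Proof. The first part follows trivially. The second one follows the same way it does in Lemma 2 in [62]. $\square$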

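2.2.2 Concentration of $\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})$

In this section we establish that $\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})$ concentrates with high probability around its mean.

Lemma 6. Let $\mathbf{g}$ and $\mathbf{h}$ be $m$ and $n$ dimensional vectors, respectively, with i.i.d. standard normal variables
as their components. Let $\sigma>0$ be an arbitrary scalar. Let $\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})$ be as in (45). Further let
$\epsilon_{lip}>0$ be any constant. Then

$$
P\left(|\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})-E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})|\ge\epsilon_{lip}E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})\right)\le\exp\left\{-\frac{(\epsilon_{lip}E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)}))^2}{2(2C_w^2+\sigma^2)}\right\}. \tag{56}
$$

Proof. The proof is the same as the proof of Lemma 4 in [62]. The only difference is the structure of set $S_w$
which does not impact substantially any of the arguments in the proof presented in [62]. $\square$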
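To get a feel for how the right-hand side of (56) behaves as the dimension grows, the following minimal Python sketch evaluates it numerically. The function names, the default parameter values, and in particular the scaling $E\xi = c\sqrt{n}$ (with a hypothetical constant `xi_per_sqrt_n`) are assumptions made purely for illustration; they are not taken from the paper.

```python
import math


def lipschitz_tail_bound(eps_lip, mean_xi, c_w, sigma):
    """Right-hand side of (56): exp(-(eps_lip * E xi)^2 / (2 * (2 * C_w^2 + sigma^2)))."""
    exponent = (eps_lip * mean_xi) ** 2 / (2.0 * (2.0 * c_w ** 2 + sigma ** 2))
    return math.exp(-exponent)


def bound_for_dimension(n, eps_lip=0.05, c_w=10.0, sigma=1.0, xi_per_sqrt_n=1.0):
    """Evaluate the bound under the illustrative assumption E xi = xi_per_sqrt_n * sqrt(n)."""
    return lipschitz_tail_bound(eps_lip, xi_per_sqrt_n * math.sqrt(n), c_w, sigma)


if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 5, 10 ** 7):
        print(n, bound_for_dimension(n))
```

Under the assumed $\sqrt{n}$ scaling the exponent is linear in $n$, so the bound decays exponentially in $n$ for any fixed $\epsilon_{lip}$ and $C_w$, although a large $C_w$ makes the decay slow.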

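One then has that $\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2$, $\|\mathbf{h}+\tilde{\nu}\mathbf{z}^{(1)}-\tilde{\lambda}^{(2)}\|_2$, $\hat{\nu}$, and $\tilde{\nu}$ concentrate as well which
automatically implies that $\hat{\mathbf{w}}$ also concentrates. More formally, one then has analogues to (56)

$$
\begin{aligned}
P\left(\left|\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2-E\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2\right|\ge\epsilon_1^{(norm)}E\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2\right)&\le e^{-\epsilon_2^{(norm)}n}\\
P\left(\left|\|\mathbf{h}+\tilde{\nu}\mathbf{z}^{(1)}-\tilde{\lambda}^{(2)}\|_2-E\|\mathbf{h}+\tilde{\nu}\mathbf{z}^{(1)}-\tilde{\lambda}^{(2)}\|_2\right|\ge\epsilon_3^{(norm)}E\|\mathbf{h}+\tilde{\nu}\mathbf{z}^{(1)}-\tilde{\lambda}^{(2)}\|_2\right)&\le e^{-\epsilon_4^{(norm)}n}\\
P\left(|\hat{\nu}-E\hat{\nu}|\ge\epsilon_1^{(\nu)}E\hat{\nu}\right)&\le e^{-\epsilon_2^{(\nu)}n}\\
P\left(|\tilde{\nu}-E\tilde{\nu}|\ge\epsilon_3^{(\nu)}E\tilde{\nu}\right)&\le e^{-\epsilon_4^{(\nu)}n}\\
P\left(\left|\|\hat{\mathbf{w}}\|_2-E\|\hat{\mathbf{w}}\|_2\right|\ge\epsilon_1^{(w)}E\|\hat{\mathbf{w}}\|_2\right)&\le e^{-\epsilon_2^{(w)}n},
\end{aligned} \tag{57}
$$

where as usual $\epsilon_1^{(norm)}>0$, $\epsilon_3^{(norm)}>0$, $\epsilon_1^{(\nu)}>0$, $\epsilon_3^{(\nu)}>0$, and $\epsilon_1^{(w)}>0$ are arbitrarily small constants
and $\epsilon_2^{(norm)}$, $\epsilon_4^{(norm)}$, $\epsilon_2^{(\nu)}$, $\epsilon_4^{(\nu)}$, and $\epsilon_2^{(w)}$ are constants dependent on $\epsilon_1^{(norm)}>0$, $\epsilon_3^{(norm)}>0$, $\epsilon_1^{(\nu)}>0$,
$\epsilon_3^{(\nu)}>0$, and $\epsilon_1^{(w)}>0$, respectively, but independent of $n$.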

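Now, we return to the probabilistic analysis of (44). Combining (44), (45), and (56) we have

$$
\begin{aligned}
p_l&=P\left(\min_{[\mathbf{w}^T\ \sigma]^T\in S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})}\left(\sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\,\|\mathbf{g}\|_2+\sum_{i=1}^{n}h_iw_i\right)+h_{n+1}\sigma\ge\zeta_{obj}^{(l)}\right)\\
&\ge\left(1-\exp\left\{-\frac{(\epsilon_{lip}E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)}))^2}{2(2C_w^2+\sigma^2)}\right\}\right)P\left((1-\epsilon_{lip})E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})+h_{n+1}\sigma\ge\zeta_{obj}^{(l)}\right),
\end{aligned} \tag{58}
$$

where we consider only the interesting case $E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})>0$. Since $h_{n+1}$ is a standard normal
one easily has $P(h_{n+1}\sigma\ge-\epsilon_1^{(h)}\sqrt{n})\ge1-e^{-\epsilon_2^{(h)}n}$ where $\epsilon_1^{(h)}>0$ is an arbitrarily small constant and
$\epsilon_2^{(h)}$ is a constant dependent on $\epsilon_1^{(h)}$ and $\sigma$ but independent of $n$. By choosing

$$
\zeta_{obj}^{(l)}=(1-\epsilon_{lip})E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})-\epsilon_1^{(h)}\sqrt{n}, \tag{59}
$$

one then from (58) has

$$
\begin{aligned}
p_l&=P\left(\min_{[\mathbf{w}^T\ \sigma]^T\in S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})}\left(\sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\,\|\mathbf{g}\|_2+\sum_{i=1}^{n}h_iw_i\right)+h_{n+1}\sigma\ge(1-\epsilon_{lip})E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})-\epsilon_1^{(h)}\sqrt{n}\right)\\
&\ge\left(1-\exp\left\{-\frac{(\epsilon_{lip}E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)}))^2}{2(2C_w^2+\sigma^2)}\right\}\right)\left(1-e^{-\epsilon_2^{(h)}n}\right).
\end{aligned} \tag{60}
$$

(60) is conceptually enough to establish a "high probability" lower bound on $\zeta_{obj}$. Mimicking the steps
between (58) and (64) in [62] one obtains the following analogue to (64) in [62]

$$
P\left(\zeta_{obj}\ge\zeta_{obj}^{(lower)}\right)\ge P\left(\min_{[\mathbf{w}^T\ \sigma]^T\in S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})}\left\|A_v\begin{bmatrix}\mathbf{w}\\ \sigma\end{bmatrix}\right\|_2\ge\zeta_{obj}^{(lower)}\right)\left(1-e^{-\epsilon_{C_w}n}\right)\ge\left(1-e^{-\epsilon_{lower}n}\right)\left(1-e^{-\epsilon_{C_w}n}\right), \tag{61}
$$

where

$$
\zeta_{obj}^{(lower)}=(1-\epsilon_{lip})E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})-\epsilon_1^{(h)}\sqrt{n}-\epsilon_1^{(g)}\sqrt{n} \tag{62}
$$

and

$$
1-e^{-\epsilon_{lower}n}\le\left(1-\exp\left\{-\frac{(\epsilon_{lip}E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)}))^2}{2(2C_w^2+\sigma^2)}\right\}\right)\left(1-e^{-\epsilon_2^{(h)}n}\right)\left(1-e^{-\epsilon_2^{(g)}n}\right). \tag{63}
$$

We summarize the results from this subsection in the following lemma.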



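Lemma 7. Let $\mathbf{v}$ be an $m\times1$ vector of i.i.d. zero-mean variance $\sigma^2$ Gaussian random variables and let
$A$ be an $m\times n$ matrix of i.i.d. standard normal random variables. Consider an $\tilde{\mathbf{x}}$ defined in (6) and a $\mathbf{y}$
defined in (3) for $\mathbf{x}=\tilde{\mathbf{x}}$. Let then $\zeta_{obj}$ be as defined in (41) and let $\hat{\mathbf{w}}$ be the solution of (41). Assume
$P(\|\hat{\mathbf{w}}\|_2\le C_w)\ge1-e^{-\epsilon_{C_w}n}$ for an arbitrarily large constant $C_w$ and a constant $\epsilon_{C_w}>0$ dependent on
$C_w$ but independent of $n$. Then there is a constant $\epsilon_{lower}>0$ such that

$$
P\left(\zeta_{obj}\ge\zeta_{obj}^{(lower)}\right)\ge\left(1-e^{-\epsilon_{lower}n}\right)\left(1-e^{-\epsilon_{C_w}n}\right), \tag{64}
$$

where

$$
\zeta_{obj}^{(lower)}=(1-\epsilon_{lip})E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})-\epsilon_1^{(h)}\sqrt{n}-\epsilon_1^{(g)}\sqrt{n}, \tag{65}
$$

$\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})$ is as defined in (45) (and can be computed through (50) and (51)), and $\epsilon_{lip}$, $\epsilon_1^{(h)}$, $\epsilon_1^{(g)}$
are all positive arbitrarily small constants.

Proof. Follows from the discussion above. $\square$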

The above lemma achieves one of the goals established at the beginning of this section. Namely, for a
$f_{obj}^{(lower)}$ it establishes a high probability lower bound $\zeta_{obj}^{(lower)}$ on $\zeta_{obj}$. As we stated earlier, if we can find
$f_{obj}^{(lower)}$ such that $\zeta_{obj}^{(lower)}>r_{socp}$ then $f_{obj}^{(lower)}$ is a high probability lower bound on $f_{obj}$. Moreover, we
hope that $f_{obj}^{(lower)}\approx f_{obj}^{(upper)}$ and that $C_{w_{up}}$ for which this would happen is such that $C_{w_{up}}\approx\|\mathbf{w}_{socp}\|_2$.
All of this is established in the following section.

2.3 Matching upper and lower bounds 

In this section we specialize the general bounds $f_{obj}^{(upper)}$ and $f_{obj}^{(lower)}$ introduced above and show how they
can match each other. We will divide the presentation into several subsections. In the first of the subsections we
will make a connection to the noiseless case and show how one can then remove the constraint from (53),
(54), and (55). In the second and third subsections we will specialize the upper and lower bounds on $f_{obj}$
computed in Sections 2.1 and 2.2, respectively, and show that they can match each other. In the fourth subsection we
will quantify how much the lower bound on $\zeta_{obj}$ that can be computed through the framework presented in
Section 2.2 for a "suboptimal" w deviates from the "optimal" one obtained for w. In the last subsection we
will connect all the pieces and draw conclusions regarding the consequences that their combination has
for several SOCP parameters.

2.3.1 Connection to the $\ell_1$ optimization

In this subsection we establish a connection between the constraint in (53), (54), and (55) and the fundamental
performance characterization of $\ell_1$ optimization derived in [64] (and of course earlier in the context
of neighborly polytopes in [26]). What we present here is exactly the same as what was presented in the
corresponding section in [62]. However, given its importance/relevance to the current analysis we include it
here again. We first recall the condition from Lemma 5. The condition states

$$
\sqrt{1+\frac{\sigma^2}{C_w^2}}\,\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2\le\|\mathbf{g}\|_2, \tag{66}
$$
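where $C_w$ is an arbitrarily large constant and $\hat{\nu}$ and $\hat{\lambda}^{(2)}$ are the solution of

$$
\begin{array}{rl}
\max_{\nu,\lambda^{(2)}} & \sigma\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu f_{obj}^{(lower)} \\
\text{subject to} & 0\le\lambda_i^{(2)}\le2\nu,\ 1\le i\le n \\
& \nu\ge0.
\end{array} \tag{67}
$$

Now we note the following equivalent to (67) in the case when nonzero components of $\tilde{\mathbf{x}}$ are infinite

$$
\begin{array}{rl}
\max_{\nu,\lambda^{(2)}} & \sigma\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2} \\
\text{subject to} & 0\le\lambda_i^{(2)}\le2\nu,\ 1\le i\le n-k \\
& \lambda_i^{(2)}=0,\ n-k+1\le i\le n \\
& \nu\ge0.
\end{array} \tag{68}
$$

To make the new observations easily comparable to the corresponding ones from [63,65] we set

$$
\bar{\mathbf{h}}=\left[|h|_{(1)},|h|_{(2)},\dots,|h|_{(n-k)},h_{n-k+1},h_{n-k+2},\dots,h_n\right]^T, \tag{69}
$$

where $[|h|_{(1)},|h|_{(2)},\dots,|h|_{(n-k)}]$ are the magnitudes of $[h_1,h_2,\dots,h_{n-k}]$ sorted in increasing order (possible
ties in the sorting process are of course broken arbitrarily). Also we let $\mathbf{z}^{(2)}$ be such that $z_i^{(2)}=-z_i^{(1)}$,
$n-k+1\le i\le n$, and $z_i^{(2)}=z_i^{(1)}$, $1\le i\le n-k$. It is then relatively easy to see that the above
optimization problem is equivalent to

$$
\begin{array}{rl}
\max_{\nu,\lambda^{(2)}} & \sigma\sqrt{\|\mathbf{g}\|_2^2-\|\bar{\mathbf{h}}-\nu\mathbf{z}^{(2)}+\lambda^{(2)}\|_2^2} \\
\text{subject to} & 0\le\lambda_i^{(2)}\le\nu,\ 1\le i\le n-k \\
& \lambda_i^{(2)}=0,\ n-k+1\le i\le n \\
& \nu\ge0.
\end{array} \tag{70}
$$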
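The construction of the rearranged vector in (69) can be illustrated with a minimal Python sketch. The function name `build_h_bar` and the list-based representation are hypothetical choices made here for illustration only; the sketch assumes, as in the text, that the last `k` positions correspond to the nonzero components of the sparse vector.

```python
def build_h_bar(h, k):
    """Rearrange h as in (69): the magnitudes of the first n-k entries are
    sorted in increasing order, while the last k entries are kept as they are."""
    n = len(h)
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    sorted_magnitudes = sorted(abs(value) for value in h[: n - k])
    return sorted_magnitudes + list(h[n - k:])


if __name__ == "__main__":
    h = [-3.0, 1.0, -2.0, 5.0, -4.0]
    print(build_h_bar(h, 2))  # [1.0, 2.0, 3.0, 5.0, -4.0]
```

Ties among magnitudes are resolved by whatever order `sorted` produces, which is consistent with the remark that ties may be broken arbitrarily.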

Let $\nu_{\ell_1}$ and $\lambda^{(\ell_1)}$ be the solution of the above maximization. Then, as we showed in [65] and [64], the
inequality

$$
E\|\mathbf{g}\|_2>E\|\bar{\mathbf{h}}-\nu_{\ell_1}\mathbf{z}^{(2)}+\lambda^{(\ell_1)}\|_2 \tag{71}
$$

establishes the following fundamental performance characterization of the $\ell_1$ optimization algorithm from
(2) that could be used instead of SOCP to recover $\mathbf{x}$ in (1) (which is a noiseless version of (3))

$$
(1-\beta_w)\frac{\sqrt{\frac{2}{\pi}}\,e^{-\left(\mathrm{erfinv}\left(\frac{1-\alpha_w}{1-\beta_w}\right)\right)^2}}{\alpha_w}-\sqrt{2}\,\mathrm{erfinv}\left(\frac{1-\alpha_w}{1-\beta_w}\right)=0. \tag{72}
$$

Clearly, in (72) one has $\alpha_w=\frac{m}{n}$ and $\beta_w=\frac{k}{n}$. As it is also shown in [65] and [64] both of the quantities
under the expected values in (71) nicely concentrate. Then with overwhelming probability one has that
for any pair $(\alpha,\beta_w)$ that satisfies (or lies below) the above fundamental performance characterization of $\ell_1$
optimization

$$
\|\mathbf{g}\|_2>\|\bar{\mathbf{h}}-\nu_{\ell_1}\mathbf{z}^{(2)}+\lambda^{(\ell_1)}\|_2. \tag{73}
$$

Moreover, since $\lambda_i^{(2)}\ge0$, $n-k+1\le i\le n$, in (67) one actually has that (73) implies that with overwhelming
probability

$$
\|\mathbf{g}\|_2>\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2, \tag{74}
$$

which for sufficiently large $C_w$ is the same as (66). We then in what follows assume that pair $(\alpha,\beta_w)$ is such
that it satisfies the fundamental $\ell_1$ optimization performance characterization (or is in the region below it)
and therefore proceed by ignoring the condition (66). (Strictly speaking, all our overwhelming probabilities
below should be multiplied with an overwhelming probability that (72) holds; to keep the writing simple we
will skip this detail.)
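As a numerical illustration of the fundamental characterization (72), the following Python sketch solves it for $\alpha_w$ given $\beta_w$. It is only a sketch under stated assumptions: the helper names (`erfinv`, `characterization_residual`, `solve_alpha_w`) are hypothetical, the inverse error function is obtained by plain bisection on the standard library's `math.erf`, and the root search assumes that the left-hand side of (72) changes sign exactly once on $(\beta_w,1)$.

```python
import math


def erfinv(y, iters=200):
    """Inverse error function on [0, 1), computed by bisection on math.erf."""
    if not 0.0 <= y < 1.0:
        raise ValueError("y must lie in [0, 1)")
    lo, hi = 0.0, 6.0  # math.erf(6.0) already rounds to 1.0 in double precision
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def characterization_residual(alpha_w, beta_w):
    """Left-hand side of (72) for a given pair (alpha_w, beta_w)."""
    t = erfinv((1.0 - alpha_w) / (1.0 - beta_w))
    first = (1.0 - beta_w) * math.sqrt(2.0 / math.pi) * math.exp(-t * t) / alpha_w
    return first - math.sqrt(2.0) * t


def solve_alpha_w(beta_w, iters=200):
    """Bisection for alpha_w in (beta_w, 1); the residual is negative near
    beta_w and positive near 1, so the bracket always contains a root."""
    if not 0.0 < beta_w < 1.0:
        raise ValueError("beta_w must lie in (0, 1)")
    lo, hi = beta_w + 1e-12, 1.0 - 1e-12
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if characterization_residual(mid, beta_w) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for beta_w in (0.05, 0.1, 0.1924, 0.3):
        print(beta_w, solve_alpha_w(beta_w))
```

A pair $(\alpha,\beta_w)$ with $\alpha$ larger than the returned value lies in the region where the text's assumption (being below the characterization) holds.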

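2.3.2 Optimizing $f_{obj}$'s upper bound

In this section we will lower the value of the upper bound created in Section 2.1 as much as we can by a
particular choice of $C_{w_{up}}$. Let $\xi_{dual}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$ be

$$
\begin{array}{rl}
\xi_{dual}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\min_{d\ge0}\max_{\nu,\lambda^{(2)}} & \sqrt{d^2+\sigma^2}\,\|\mathbf{g}\|_2\nu-d\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{75}
$$

Rewriting (75) with a simple sign flipping turns out to be useful in what follows

$$
\begin{array}{rl}
-\xi_{dual}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\max_{d\ge0}\min_{\nu,\lambda^{(2)}} & -\sqrt{d^2+\sigma^2}\,\|\mathbf{g}\|_2\nu+d\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2+\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i+\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{76}
$$

The following lemma provides a powerful tool to deal with (76).

Lemma 8. Let $\xi_{dual}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$ be as defined in (76). Further, let

$$
\begin{array}{rl}
-\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\min_{\nu,\lambda^{(2)}}\max_{d\ge0} & -\sqrt{d^2+\sigma^2}\,\|\mathbf{g}\|_2\nu+d\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2+\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i+\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{77}
$$

Then

$$
\xi_{dual}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp}). \tag{78}
$$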


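Proof. After solving the inner maximization over $d$ in (77) one has

$$
d_{opt}=\sigma\frac{\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2}{\sqrt{\|\mathbf{g}\|_2^2\nu^2-\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}}. \tag{79}
$$

Such a $d$ then establishes that the right-hand side of (77) is

$$
\begin{array}{rl}
-\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\min_{\nu,\lambda^{(2)}} & -\sigma\sqrt{\|\mathbf{g}\|_2^2\nu^2-\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}+\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i+\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{80}
$$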
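The step from (77) to (79) and (80) is a one-dimensional maximization over $d$ and can be checked numerically. In the Python sketch below the scalars `a` and `b` are shorthand introduced here for $\|\mathbf{g}\|_2\nu$ and $\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2$; the function names and the numerical values are hypothetical and serve only to illustrate that the closed-form maximizer attains the value $-\sigma\sqrt{a^2-b^2}$.

```python
import math


def inner_objective(d, a, b, sigma):
    """d-dependent part of (77): -sqrt(d^2 + sigma^2) * a + d * b."""
    return -math.sqrt(d * d + sigma * sigma) * a + d * b


def d_opt(a, b, sigma):
    """Closed-form maximizer from (79); requires a > b >= 0."""
    if not a > b >= 0.0:
        raise ValueError("need a > b >= 0 for a finite maximizer")
    return sigma * b / math.sqrt(a * a - b * b)


def closed_form_value(a, b, sigma):
    """Value of the inner maximum, i.e. the square-root term appearing in (80)."""
    return -sigma * math.sqrt(a * a - b * b)


if __name__ == "__main__":
    a, b, sigma = 5.0, 3.0, 2.0
    d_star = d_opt(a, b, sigma)
    # Coarse grid search over d in [0, 10] as an independent sanity check.
    best_on_grid = max(inner_objective(0.001 * i, a, b, sigma) for i in range(10001))
    print(d_star, inner_objective(d_star, a, b, sigma),
          closed_form_value(a, b, sigma), best_on_grid)
```

Because the inner objective is concave in $d$, the grid maximum can never exceed the closed-form value by more than rounding error.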



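Now we digress for a moment and consider the following optimization problem

$$
\begin{array}{rl}
\min_{\nu,\lambda^{(2)},q_1,q_2} & -\sigma q_1+\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i+\nu r_{socp} \\
\text{subject to} & \|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2\le q_2 \\
& \sqrt{q_1^2+q_2^2}\le\|\mathbf{g}\|_2\nu \\
& \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{81}
$$

Let $-\xi_{prim}^{(1)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$ be the optimal value of its objective function. Let quadruplet $\hat{\nu},\hat{\lambda}^{(2)},\hat{q}_1,\hat{q}_2$ be
the solution of the above optimization problem. Then it must be

$$
\|\hat{\nu}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2=\hat{q}_2 \tag{82}
$$

and consequently

$$
\begin{aligned}
\hat{q}_1&=\sqrt{\|\mathbf{g}\|_2^2\hat{\nu}^2-\|\hat{\nu}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2^2}\\
-\xi_{prim}^{(1)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})&=-\sigma\sqrt{\|\mathbf{g}\|_2^2\hat{\nu}^2-\|\hat{\nu}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2^2}+\sum_{i=n-k+1}^{n}\hat{\lambda}_i^{(2)}\tilde{x}_i+\hat{\nu}r_{socp}.
\end{aligned} \tag{83}
$$

The above claim is rather obvious but for completeness we sketch the argument that supports it. Assume
that $\|\hat{\nu}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2<\hat{q}_2$. Then $\hat{q}_1<\sqrt{\|\mathbf{g}\|_2^2\hat{\nu}^2-\|\hat{\nu}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2^2}$, and $-\xi_{prim}^{(1)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$
would be larger than the expression on the right-hand side of (83). Now, since (82) and (83) hold one has
that $-\xi_{prim}^{(1)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$ can be determined through the following equivalent to (81)

$$
\begin{array}{rl}
-\xi_{prim}^{(1)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\min_{\nu,\lambda^{(2)}} & -\sigma\sqrt{\|\mathbf{g}\|_2^2\nu^2-\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}+\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i+\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{84}
$$

After comparing (80) and (84) we have

$$
\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\xi_{prim}^{(1)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp}). \tag{85}
$$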

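Now, let us write the Lagrange dual of the optimization problem in (81). Let $d$ and $\gamma_1$ be Lagrangian
variables such that

$$
\begin{array}{rl}
\max_{d\ge0,\gamma_1\ge0}\ \min_{\nu,\lambda^{(2)},q_1,q_2} & -\sigma q_1+\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i+d\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2-dq_2+\gamma_1\sqrt{q_1^2+q_2^2}-\gamma_1\|\mathbf{g}\|_2\nu+\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{86}
$$

After solving the inner minimization over $q_1,q_2$ and maximization over $\gamma_1$ one finally has

$$
\begin{array}{rl}
\max_{d\ge0}\ \min_{\nu,\lambda^{(2)}} & -\sqrt{\sigma^2+d^2}\,\|\mathbf{g}\|_2\nu+\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i+d\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2+\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{87}
$$

Let $-\xi_{prim}^{(2)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$ be the optimal value of the objective function in (87). Since (87) is the dual of (81)
and since strong duality obviously holds (the optimization problem in (81) is clearly convex) one has

$$
-\xi_{prim}^{(1)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=-\xi_{prim}^{(2)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp}). \tag{88}
$$

On the other hand the optimization problem in (87) is the same as the one in (76) and therefore

$$
-\xi_{prim}^{(2)}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=-\xi_{dual}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp}). \tag{89}
$$

Connecting (85), (88), and (89) one finally has

$$
\xi_{dual}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp}) \tag{90}
$$

which is what is stated in (78). This concludes the proof. $\square$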
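The step from (86) to (87) in the proof above can be spelled out as follows; this is a sketch of the standard argument rather than a reproduction of the original derivation. For fixed $d\ge0$ and $\gamma_1\ge0$ the terms of (86) that involve $q_1,q_2$ satisfy, by the Cauchy-Schwarz inequality,

$$
\min_{q_1,q_2}\left(-\sigma q_1-dq_2+\gamma_1\sqrt{q_1^2+q_2^2}\right)=
\begin{cases}
0, & \gamma_1\ge\sqrt{\sigma^2+d^2}\\
-\infty, & \text{otherwise,}
\end{cases}
$$

since $\sigma q_1+dq_2\le\sqrt{\sigma^2+d^2}\sqrt{q_1^2+q_2^2}$ with equality attainable. The outer maximization therefore restricts to $\gamma_1\ge\sqrt{\sigma^2+d^2}$, and because $\|\mathbf{g}\|_2\nu\ge0$ the remaining term $-\gamma_1\|\mathbf{g}\|_2\nu$ is largest at the smallest admissible value,

$$
\gamma_1=\sqrt{\sigma^2+d^2},
$$

which gives the first term $-\sqrt{\sigma^2+d^2}\,\|\mathbf{g}\|_2\nu$ of (87).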

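Let $\hat{d},\hat{\nu}_{up},\hat{\lambda}_{up}^{(2)}$ be the solution of (75) (or alternatively let $\hat{\nu}_{up},\hat{\lambda}_{up}^{(2)}$ be the solution of (77) or (80)).
Clearly,

$$
\hat{d}=\sigma\frac{\|\hat{\nu}_{up}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}_{up}^{(2)}\|_2}{\sqrt{\|\mathbf{g}\|_2^2\hat{\nu}_{up}^2-\|\hat{\nu}_{up}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}_{up}^{(2)}\|_2^2}}. \tag{91}
$$

As shown in Section 2.1 all quantities of interest concentrate and one has

$$
E\hat{d}\doteq\sigma\frac{E\|\hat{\nu}_{up}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}_{up}^{(2)}\|_2}{\sqrt{E\|\mathbf{g}\|_2^2E\hat{\nu}_{up}^2-E\|\hat{\nu}_{up}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}_{up}^{(2)}\|_2^2}}, \tag{92}
$$

where $\doteq$ indicates that the equality is not exact but can be made through the concentrations as close to it as
needed. Now, set $C_{w_{up}}=E\hat{d}$ in (21). Then a combination of (21), (75), and Lemma 8 gives

$$
\begin{aligned}
E\xi_{up}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp},E\hat{d})&=E\max_{\lambda^{(2)}\in\Lambda^{(2)},\nu\ge0}\left(\sqrt{(E\hat{d})^2+\sigma^2}\,\|\mathbf{g}\|_2\nu-E\hat{d}\,\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu r_{socp}\right)\\
&=E\min_{d\ge0}\max_{\lambda^{(2)}\in\Lambda^{(2)},\nu\ge0}\left(\sqrt{d^2+\sigma^2}\,\|\mathbf{g}\|_2\nu-d\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu r_{socp}\right)\\
&=E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp}).
\end{aligned} \tag{93}
$$

Moreover, one then from (80) has

$$
-E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=-\sigma\sqrt{E\|\mathbf{g}\|_2^2E\hat{\nu}_{up}^2-E\|\hat{\nu}_{up}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}_{up}^{(2)}\|_2^2}+E\left(\sum_{i=n-k+1}^{n}(\hat{\lambda}_{up}^{(2)})_i\tilde{x}_i\right)+E\hat{\nu}_{up}r_{socp}, \tag{94}
$$

where $(\hat{\lambda}_{up}^{(2)})_i$ is the $i$-th component of $\hat{\lambda}_{up}^{(2)}$.

Let $\mathbf{w}_{up}$ be the solution of (14). Then $E\|\mathbf{w}_{up}\|_2=C_{w_{up}}=E\hat{d}$ and with overwhelming probability
$f_{obj}\le f_{obj}^{(upper)}\le E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})+\epsilon_{lip}|E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})|$ for an arbitrarily small positive
constant $\epsilon_{lip}$ ($E\hat{d}$ is of course as defined in (92)). In the following section we will show that with overwhelming
probability $f_{obj}\ge f_{obj}^{(lower)}\ge E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})-\epsilon_{lip}|E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})|$ which
will establish $E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$ as the concentrating point of $f_{obj}$. Moreover, we will show that if
$\mathbf{w}_{socp}$ is such that $E\|\mathbf{w}_{socp}\|_2$ substantially deviates from $E\|\mathbf{w}_{up}\|_2$ then $f_{obj}$ would substantially deviate
from $E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$ which will establish $E\|\mathbf{w}_{up}\|_2=C_{w_{up}}=E\hat{d}$ as the concentrating point of
$\|\mathbf{w}_{socp}\|_2$.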

2.3.3 Specializing $f_{obj}$'s lower bound

In this section we finally determine the concentrating point of $f_{obj}$. To that end let us assume



.cpj'''sOCp) 



i=n—k+l 

(95) 
where $\epsilon_{r_{socp}}>0$ is an arbitrarily small but fixed constant. From (50) one then has



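$$
\begin{array}{rl}
\xi_{ov}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\max_{\nu,\lambda^{(2)}} & \sigma\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu f_{obj}^{(lower)} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2\nu,\ 1\le i\le n.
\end{array} \tag{96}
$$

Let us choose $\nu=\frac{1}{\hat{\nu}_{up}}$ and $\lambda^{(2)}=\frac{\hat{\lambda}_{up}^{(2)}}{\hat{\nu}_{up}}$ in the above optimization. Since this choice is suboptimal and since
all the quantities concentrate (95) would imply

$$
E\xi_{ov}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})\ge(1+\epsilon_{r_{socp}})r_{socp}. \tag{97}
$$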
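The effect of the suboptimal choice $\nu=1/\hat{\nu}_{up}$, $\lambda^{(2)}=\hat{\lambda}_{up}^{(2)}/\hat{\nu}_{up}$ can be made explicit with a short calculation. It is a sketch that assumes $\hat{\nu}_{up}>0$ and takes $\hat{\nu}_{up},\hat{\lambda}_{up}^{(2)}$ to be the maximizers in (80), so that $\xi_{prim}$ equals the objective of (80) evaluated at them. The choice is feasible in (96) because $0\le\hat{\lambda}_{up,i}^{(2)}\le2$ implies $0\le\lambda_i^{(2)}\le2\nu$. Evaluating the objective of (96) at this point and factoring out $1/\hat{\nu}_{up}$ gives

$$
\begin{aligned}
\xi_{ov}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})
&\ge\frac{1}{\hat{\nu}_{up}}\left(\sigma\sqrt{\|\mathbf{g}\|_2^2\hat{\nu}_{up}^2-\|\hat{\nu}_{up}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}_{up}^{(2)}\|_2^2}-\sum_{i=n-k+1}^{n}(\hat{\lambda}_{up}^{(2)})_i\tilde{x}_i-f_{obj}^{(lower)}\right)\\
&=r_{socp}+\frac{\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})-f_{obj}^{(lower)}}{\hat{\nu}_{up}} .
\end{aligned}
$$

The right-hand side is at least $(1+\epsilon_{r_{socp}})r_{socp}$ exactly when $f_{obj}^{(lower)}$ lies below $\xi_{prim}$ by at least $\epsilon_{r_{socp}}r_{socp}\hat{\nu}_{up}$, which is the kind of gap that leads to a bound of the form (97).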
On the other hand based on a combination of the arguments from Section 2.3.1 and (97) one would also
have

$$
E\xi(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},f_{obj}^{(lower)})=E\xi_{ov}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})\ge(1+\epsilon_{r_{socp}})r_{socp}. \tag{98}
$$

Finally a combination of (98) and Lemma 7 would give

$$
P\left(\zeta_{obj}\ge(1+\epsilon_{r_{socp}})(1-\epsilon_{lip})r_{socp}-\epsilon_1^{(h)}\sqrt{n}-\epsilon_1^{(g)}\sqrt{n}\right)\ge\left(1-e^{-\epsilon_{lower}n}\right)\left(1-e^{-\epsilon_{C_w}n}\right), \tag{99}
$$

where for any arbitrarily small but fixed $\epsilon_{r_{socp}}$ one can choose much smaller $\epsilon_{lip}$, $\epsilon_1^{(h)}$, $\epsilon_1^{(g)}$ and make their
presence in the above inequality negligible. On the other hand, in a statistical sense, (99) would contradict
the setup of (9). Therefore our assumption that $f_{obj}^{(lower)}$ satisfies (95) is with overwhelming probability
unsustainable. A combination of (99), (93), (94), results from Lemma 4, and the discussion right after
Lemma 7 imply that $f_{obj}$ concentrates around $E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$.

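2.3.4 $\|\mathbf{w}_{socp}\|_2$'s deviation from $\|\mathbf{w}_{up}\|_2$

In this subsection we will show that $\|\mathbf{w}_{socp}\|_2$ cannot deviate substantially from $\|\mathbf{w}_{up}\|_2$ without substantially
affecting the value of the lower bound on the objective in (9) that is derived in Section 2.2. To that
end let us assume that there is a $\mathbf{w}_{off}$ that is the solution of the SOCP from (9) (or to be slightly more
precise that is such that $\mathbf{x}_{socp}=\tilde{\mathbf{x}}+\mathbf{w}_{off}$, where obviously $\mathbf{x}_{socp}$ is the solution of (9) or (4)). Further, let
$|\|\mathbf{w}_{off}\|_2-\|\mathbf{w}_{up}\|_2|\ge\epsilon_{w_{up}}\|\mathbf{w}_{up}\|_2$, where $\epsilon_{w_{up}}$ is an arbitrarily small constant.

One can then proceed by repeating the same line of thought as in Section 2.2. The only difference will be
that now $C_w=\|\mathbf{w}_{off}\|_2$ and consequently in the definition of $S_w(\sigma,\tilde{\mathbf{x}},C_w,f_{obj}^{(lower)})$, $\|\mathbf{w}\|_2\le C_w$ changes
to $\|\mathbf{w}\|_2=C_w=\|\mathbf{w}_{off}\|_2$. This difference will not of course affect the concept presented in Section 2.2.
The only real consequence will be the change of (46). Adapted to the new scenario (46) becomes

$$
\begin{array}{rl}
\xi_{off}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp},\|\mathbf{w}_{off}\|_2)=\min_{\mathbf{w}} & \sqrt{\|\mathbf{w}_{off}\|_2^2+\sigma^2}\,\|\mathbf{g}\|_2+\sum_{i=1}^{n}h_iw_i \\
\text{subject to} & \|\tilde{\mathbf{x}}+\mathbf{w}\|_1-\|\tilde{\mathbf{x}}\|_1\le E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp}) \\
& \sqrt{\|\mathbf{w}\|_2^2+\sigma^2}\le\sqrt{\|\mathbf{w}_{off}\|_2^2+\sigma^2}.
\end{array} \tag{100}
$$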



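One can then proceed further with solving the Lagrangian to obtain (this is pretty much analogous to what
was done in Section 3.3.2 in [62]; the only difference is a subtle change in the first constraint)

$$
\xi_{off}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp},\|\mathbf{w}_{off}\|_2)=\max_{\lambda^{(2)}\in\Lambda_\nu^{(2)},\nu\ge0}\left(\sqrt{\|\mathbf{w}_{off}\|_2^2+\sigma^2}\,\|\mathbf{g}\|_2-\|\mathbf{w}_{off}\|_2\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})\right), \tag{101}
$$

where $\Lambda_\nu^{(2)}=\{\lambda^{(2)}\,|\,0\le\lambda_i^{(2)}\le2\nu,\ 1\le i\le n\}$. Using the probabilistic arguments from Section 2.2 one
then from Lemma 7 has that if $\mathbf{w}_{off}$ is the solution of (9) then the objective value of (38) (or the objective
value of (35)) is with overwhelming probability lower bounded by $(1-\epsilon_{lip})E\xi_{off}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp},\|\mathbf{w}_{off}\|_2)$
($\xi_{off}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp},\|\mathbf{w}_{off}\|_2)$ is structurally the same as $\xi_{up}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp},C_{w_{up}})$ from (21) and therefore
easily concentrates based on Lemma 3). We will now consider in parallel the following lower bound on
the objective value of (38) that is presented in (50).

$$
\xi_{ov}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\max_{\lambda^{(2)}\in\Lambda_\nu^{(2)},\nu\ge0}\left(\sigma\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})\right). \tag{102}
$$

Let $\hat{\nu}$ and $\hat{\lambda}^{(2)}$ be the solution of (102) and let

$$
\xi_{help}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp},\|\mathbf{w}_{off}\|_2)=\sqrt{\|\mathbf{w}_{off}\|_2^2+\sigma^2}\,\|\mathbf{g}\|_2-\|\mathbf{w}_{off}\|_2\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2-\sum_{i=n-k+1}^{n}\hat{\lambda}_i^{(2)}\tilde{x}_i-\hat{\nu}E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp}). \tag{103}
$$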

Repeating the arguments presented between (115) and (122) in [62] one obtains the following analogue to 
(122) from [62] 



el 



E^off{cr,g,h,ii.,rsocp, W^offh) - E^ov{cr,g,h,it,rsocp) > ^.. "" — ^E^e, 



(104) 



where $\xi_\sigma=\sigma\sqrt{(E\|\mathbf{g}\|_2)^2-(E\|\mathbf{h}+\hat{\nu}\mathbf{z}^{(1)}-\hat{\lambda}^{(2)}\|_2)^2}$. As shown in Section 2.3.3 if one has that $f_{obj}^{(lower)}=
E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$ (which is the case in (101)) then $E\xi_{ov}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})\ge r_{socp}$. Knowing that,
(104) basically shows that if $\|\mathbf{w}_{socp}\|_2$ were to deviate from $\|\mathbf{w}_{up}\|_2$ the optimal value of the objective in
(38) would concentrate around a point that is non-trivially higher than $r_{socp}$ (note that $E\xi_\sigma\sim\sqrt{n}$). This again
contradicts the setup of (9) and makes our deviating assumption unsustainable with overwhelming probability.
Hence $\mathbf{w}_{socp}$ is such that $\|\mathbf{w}_{socp}\|_2$ concentrates around $E\|\mathbf{w}_{up}\|_2$ with overwhelming probability.

2.4 Connecting all pieces 

In this section we connect all of the above. We will summarize the results obtained so far in the following 
theorem. 

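Theorem 1. Let $\mathbf{v}$ be an $m\times1$ vector of i.i.d. zero-mean variance $\sigma^2$ Gaussian random variables and let
$A$ be an $m\times n$ matrix of i.i.d. standard normal random variables. Further, let $\mathbf{g}$ and $\mathbf{h}$ be $m\times1$ and
$n\times1$ vectors of i.i.d. standard normals, respectively. Consider a $k$-sparse $\tilde{\mathbf{x}}$ defined in (6) and a $\mathbf{y}$ defined
in (3) for $\mathbf{x}=\tilde{\mathbf{x}}$. Let the solution of (4) be $\mathbf{x}_{socp}$ and let the so-called error vector of the SOCP from (4)
be $\mathbf{w}_{socp}=\mathbf{x}_{socp}-\tilde{\mathbf{x}}$. Let $r_{socp}$ in (4) be a positive scalar. Let $n$ be large and let constants $\alpha=\frac{m}{n}$ and
$\beta_w=\frac{k}{n}$ be below the fundamental characterization (72). Consider the following optimization problem:

$$
\begin{array}{rl}
\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})=\max_{\nu,\lambda^{(2)}} & \sigma\sqrt{\|\mathbf{g}\|_2^2\nu^2-\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\sum_{i=n-k+1}^{n}\lambda_i^{(2)}\tilde{x}_i-\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n.
\end{array} \tag{105}
$$

Let $\hat{\nu}_{up}$ and $\hat{\lambda}_{up}^{(2)}$ be the solution of (105). Set

$$
\|\mathbf{w}_{up}\|_2=\sigma\frac{\|\hat{\nu}_{up}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}_{up}^{(2)}\|_2}{\sqrt{\|\mathbf{g}\|_2^2\hat{\nu}_{up}^2-\|\hat{\nu}_{up}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}_{up}^{(2)}\|_2^2}}. \tag{106}
$$

Then:

$$
P\left(\|\tilde{\mathbf{x}}+\mathbf{w}_{socp}\|_1-\|\tilde{\mathbf{x}}\|_1\in\left(E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})-\epsilon_1^{(socp)}|E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})|,\ E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})+\epsilon_1^{(socp)}|E\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})|\right)\right)=1-e^{-\epsilon_2^{(socp)}n} \tag{107}
$$

$$
P\left((1-\epsilon_1^{(socp)})E\|\mathbf{w}_{up}\|_2\le\|\mathbf{w}_{socp}\|_2\le(1+\epsilon_1^{(socp)})E\|\mathbf{w}_{up}\|_2\right)=1-e^{-\epsilon_2^{(socp)}n}, \tag{108}
$$

where $\epsilon_1^{(socp)}>0$ is an arbitrarily small constant and $\epsilon_2^{(socp)}$ is a constant dependent on $\epsilon_1^{(socp)}$ and $\sigma$ but
independent of $n$.

Proof. Follows from the above discussion and a combination of (50), discussions in Section 2.3.1 and those
after (99) and (104), and Lemmas 4 and 7. $\square$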

The above result is fairly powerful. In a sense it is for the SOCP algorithms what Theorem 2 from
[62] is for the LASSO algorithms. It enables one to compute many quantities that could be of interest in
characterizing performance of SOCP algorithms. For example, one can precisely estimate the norm of the
error vector for the SOCP and can do so for any given $k$-sparse vector $\tilde{\mathbf{x}}$. Furthermore, all of it is done
through a transformation of the original SOCP from (4) to a much simpler optimization program (105).
While many quantities of interest in SOCP recovery can be computed through the mechanism presented
above, below we focus only on a couple of quantities that relate to what we will call SOCP's generic
performance scenario. Computation of all other quantities that we consider to be of interest in generic or other
types of performance scenarios will be presented in a series of forthcoming papers.

2.4.1 SOCP's generic performance 

The results presented in the above theorem are rather general and can be used to analyze pretty much any
possible scenario where SOCP algorithms can be applied. Here we will focus on the so-called "worst-case"
scenario or as we will refer to it "generic performance" scenario. We will consider a simplification of (105)
which, among other things, enables one to find a particular "generic" choice of $r_{socp}$ for which $E\|\mathbf{w}_{up}\|_2$
from Theorem 1 can be upper-bounded over the set of all $\tilde{\mathbf{x}}$'s. Let us now assume that all nonzero components
of $\tilde{\mathbf{x}}$ in (3) are infinite. Then the simplification that we will consider will be (105) with such an $\tilde{\mathbf{x}}$. In such a
scenario the optimization problem from (105) clearly becomes

$$
\begin{array}{rl}
\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})=\max_{\nu,\lambda^{(2)}} & \sigma\sqrt{\|\mathbf{g}\|_2^2\nu^2-\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& \lambda_i^{(2)}=0,\ n-k+1\le i\le n \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n-k.
\end{array} \tag{109}
$$

Obviously, $\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})\le\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})$. Then the following generic equivalent to Theorem 1
can be established.
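Theorem 2. Assume the setup of Theorem 1. Consider the following optimization problem:

$$
\begin{array}{rl}
\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})=\max_{\nu,\lambda^{(2)}} & \sigma\sqrt{\|\mathbf{g}\|_2^2\nu^2-\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\nu r_{socp} \\
\text{subject to} & \nu\ge0 \\
& \lambda_i^{(2)}=0,\ n-k+1\le i\le n \\
& 0\le\lambda_i^{(2)}\le2,\ 1\le i\le n-k.
\end{array} \tag{110}
$$

Let $\hat{\nu}_{gen}$ and $\hat{\lambda}^{(gen)}$ be the solution of (110). Set

$$
\|\mathbf{w}_{gen}\|_2=\sigma\frac{\|\hat{\nu}_{gen}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}^{(gen)}\|_2}{\sqrt{\|\mathbf{g}\|_2^2\hat{\nu}_{gen}^2-\|\hat{\nu}_{gen}\mathbf{h}+\mathbf{z}^{(1)}-\hat{\lambda}^{(gen)}\|_2^2}}. \tag{111}
$$

Then:

$$
P\left(\min_{\tilde{\mathbf{x}}}\left(\xi_{prim}(\sigma,\mathbf{g},\mathbf{h},\tilde{\mathbf{x}},r_{socp})\right)\in\left(E\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})-\epsilon_1^{(socp)}|E\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})|,\ E\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})+\epsilon_1^{(socp)}|E\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})|\right)\right)=1-e^{-\epsilon_2^{(socp)}n} \tag{112}
$$

$$
P\left(\exists\,\mathbf{w}_{socp}\ \big|\ \|\mathbf{w}_{socp}\|_2\in\left((1-\epsilon_1^{(socp)})E\|\mathbf{w}_{gen}\|_2,\ (1+\epsilon_1^{(socp)})E\|\mathbf{w}_{gen}\|_2\right)\right)\ge1-e^{-\epsilon_2^{(socp)}n}, \tag{113}
$$

where $\epsilon_1^{(socp)}>0$ is an arbitrarily small constant and $\epsilon_2^{(socp)}$ is a constant dependent on $\epsilon_1^{(socp)}$ and $\sigma$ but
independent of $n$.

Proof. Follows from the above discussion and Theorem 1. $\square$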



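2.4.2 Optimal $r_{socp}$

In this section we design a particular choice of $r_{socp}$ that enables favorable performance of (4) as far as the
$\ell_2$ norm of the error vector is concerned (of course, the $\ell_2$ norm of the error vector is not the only possible
measure of performance of (4)). To that end let us slightly change the objective of (110) in the following
way

$$
\begin{array}{rl}
\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})=\max_{\nu,\lambda^{(2)}} & \frac{1}{\nu}\left(\sigma\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-r_{socp}\right) \\
\text{subject to} & \nu\ge0 \\
& \lambda_i^{(2)}=0,\ n-k+1\le i\le n \\
& 0\le\lambda_i^{(2)}\le2\nu,\ 1\le i\le n-k.
\end{array} \tag{114}
$$

Repeating the arguments between (68) and (70) one has that the following is equivalent to (114)

$$
\begin{array}{rl}
\xi_{prim}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp})=\max_{\nu,\lambda^{(2)}} & \frac{1}{\nu}\left(\sigma\sqrt{\|\mathbf{g}\|_2^2-\|\bar{\mathbf{h}}-\nu\mathbf{z}^{(2)}+\lambda^{(2)}\|_2^2}-r_{socp}\right) \\
\text{subject to} & \nu\ge0 \\
& \lambda_i^{(2)}=0,\ n-k+1\le i\le n \\
& 0\le\lambda_i^{(2)}\le\nu,\ 1\le i\le n-k.
\end{array} \tag{115}
$$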
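The change of objective that leads from (110) to (114) can be read as a rescaling of the optimization variables. The following sketch uses primed symbols, introduced here only for illustration, for the variables of (114). With $\nu'=1/\nu$ and $\lambda'^{(2)}=\lambda^{(2)}/\nu$ for $\nu>0$ one has

$$
\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2=\nu\,\|\mathbf{h}+\nu'\mathbf{z}^{(1)}-\lambda'^{(2)}\|_2 ,
$$

so that the objective of (110) becomes

$$
\sigma\sqrt{\|\mathbf{g}\|_2^2\nu^2-\|\nu\mathbf{h}+\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}-\nu r_{socp}
=\frac{1}{\nu'}\left(\sigma\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\nu'\mathbf{z}^{(1)}-\lambda'^{(2)}\|_2^2}-r_{socp}\right),
$$

while the box constraint $0\le\lambda_i^{(2)}\le2$ turns into $0\le\lambda_i'^{(2)}\le2\nu'$ and the zero constraints on the last $k$ components are unchanged. Dropping the primes gives (114).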
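Set

$$
r_{socp}^{(opt)}=\sigma\sqrt{(E\|\mathbf{g}\|_2)^2-(E\|\bar{\mathbf{h}}-\nu_{\ell_1}\mathbf{z}^{(2)}+\lambda^{(\ell_1)}\|_2)^2}, \tag{116}
$$

where $\nu_{\ell_1}$ and $\lambda^{(\ell_1)}$ are as defined in Section 2.3.1. Clearly,

$$
(\nu_{\ell_1},\lambda^{(\ell_1)})=\arg\max_{\nu\ge0,\lambda^{(2)}}\sqrt{\|\mathbf{g}\|_2^2-\|\bar{\mathbf{h}}-\nu\mathbf{z}^{(2)}+\lambda^{(2)}\|_2^2}, \tag{117}
$$

where the maximization is over $\lambda^{(2)}$ such that $0\le\lambda_i^{(2)}\le\nu$, $1\le i\le n-k$, and $\lambda_i^{(2)}=0$, $n-k+1\le i\le n$. Using further the
arguments from Section 2.3.1 we have

$$
r_{socp}^{(opt)}=\sigma\sqrt{(\alpha-\alpha_w)n}, \tag{118}
$$

where $\alpha_w$ is as defined in the fundamental characterization (72). Let $\mathbf{w}_{gen}^{(opt)}$ be $\mathbf{w}_{gen}$ in Theorem 2 obtained
for $r_{socp}=r_{socp}^{(opt)}$. Then

$$
E\|\mathbf{w}_{gen}^{(opt)}\|_2=\sigma\frac{E\|\bar{\mathbf{h}}-\nu_{\ell_1}\mathbf{z}^{(2)}+\lambda^{(\ell_1)}\|_2}{\sqrt{(E\|\mathbf{g}\|_2)^2-(E\|\bar{\mathbf{h}}-\nu_{\ell_1}\mathbf{z}^{(2)}+\lambda^{(\ell_1)}\|_2)^2}}=\sigma\sqrt{\frac{\alpha_w}{\alpha-\alpha_w}}. \tag{119}
$$

Now, let us consider $\hat{\nu}_{gen}$ and $\hat{\lambda}^{(gen)}$ that are the solution of (110) obtained for $r_{socp}\ne r_{socp}^{(opt)}$. Since $\nu_{\ell_1}$ and
$\lambda^{(\ell_1)}$ are optimal in the optimization in (117) we have

$$
\begin{aligned}
\sqrt{\|\mathbf{g}\|_2^2-\|\bar{\mathbf{h}}-\nu_{\ell_1}\mathbf{z}^{(2)}+\lambda^{(\ell_1)}\|_2^2}&=\max_{\nu\ge0,\lambda^{(2)}}\sqrt{\|\mathbf{g}\|_2^2-\|\bar{\mathbf{h}}-\nu\mathbf{z}^{(2)}+\lambda^{(2)}\|_2^2}\\
&=\max_{\nu\ge0,\lambda^{(2)}}\sqrt{\|\mathbf{g}\|_2^2-\|\mathbf{h}+\nu\mathbf{z}^{(1)}-\lambda^{(2)}\|_2^2}\ \ge\ \sqrt{\|\mathbf{g}\|_2^2-\left\|\mathbf{h}+\frac{1}{\hat{\nu}_{gen}}\mathbf{z}^{(1)}-\frac{\hat{\lambda}^{(gen)}}{\hat{\nu}_{gen}}\right\|_2^2},
\end{aligned} \tag{120}
$$

where the second maximization is over $\lambda^{(2)}$ such that $0\le\lambda_i^{(2)}\le2\nu$, $1\le i\le n-k$, and $\lambda_i^{(2)}=0$, $n-k+1\le i\le n$. Finally we obtain

$$
E\|\mathbf{w}_{gen}^{(opt)}\|_2=\sigma\frac{E\|\bar{\mathbf{h}}-\nu_{\ell_1}\mathbf{z}^{(2)}+\lambda^{(\ell_1)}\|_2}{\sqrt{(E\|\mathbf{g}\|_2)^2-(E\|\bar{\mathbf{h}}-\nu_{\ell_1}\mathbf{z}^{(2)}+\lambda^{(\ell_1)}\|_2)^2}}\le\sigma\frac{E\left\|\mathbf{h}+\frac{1}{\hat{\nu}_{gen}}\mathbf{z}^{(1)}-\frac{\hat{\lambda}^{(gen)}}{\hat{\nu}_{gen}}\right\|_2}{\sqrt{(E\|\mathbf{g}\|_2)^2-\left(E\left\|\mathbf{h}+\frac{1}{\hat{\nu}_{gen}}\mathbf{z}^{(1)}-\frac{\hat{\lambda}^{(gen)}}{\hat{\nu}_{gen}}\right\|_2\right)^2}}.
$$

Since both $\|\mathbf{w}_{gen}^{(opt)}\|_2$ and $\|\mathbf{w}_{gen}\|_2$ concentrate one also has

$$
P\left(\|\mathbf{w}_{gen}^{(opt)}\|_2\le\|\mathbf{w}_{gen}\|_2\right)\ge1-e^{-\epsilon_{w_{gen}}n}, \tag{121}
$$

where $\epsilon_{w_{gen}}>0$ is a constant independent of $n$. Roughly speaking (121) shows that if $r_{socp}\ne r_{socp}^{(opt)}$ then
with overwhelming probability there will be a solution to the SOCP from (4), $\mathbf{x}_{socp}$, such that $\|\mathbf{w}_{socp}\|_2\ge
\|\mathbf{w}_{gen}^{(opt)}\|_2$.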

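Now let us look at general $\tilde{\mathbf{x}}$ and the corresponding optimization problem (105). Let $r_{socp}=r_{socp}^{(opt)}$ in
(105). Further, let $\hat{\nu}_{up}$ and $\hat{\lambda}_{up}^{(2)}$ be the solution of (105) obtained for $r_{socp}=r_{socp}^{(opt)}$. Then clearly,

$$
\sigma\sqrt{(E\|\mathbf{g}\|_2)^2-\left(E\left\|\mathbf{h}+\frac{1}{\hat{\nu}_{up}}\mathbf{z}^{(1)}-\frac{\hat{\lambda}_{up}^{(2)}}{\hat{\nu}_{up}}\right\|_2\right)^2}-\sum_{i=n-k+1}^{n}\frac{(\hat{\lambda}_{up}^{(2)})_i}{\hat{\nu}_{up}}\tilde{x}_i\ge r_{socp}^{(opt)}=\sigma\sqrt{(\alpha-\alpha_w)n}.
$$

The nonnegativity of $\hat{\nu}_{up}$ and the components of $\hat{\lambda}_{up}^{(2)}$ and $\tilde{\mathbf{x}}$ implies

$$
\sigma\sqrt{(E\|\mathbf{g}\|_2)^2-\left(E\left\|\mathbf{h}+\frac{1}{\hat{\nu}_{up}}\mathbf{z}^{(1)}-\frac{\hat{\lambda}_{up}^{(2)}}{\hat{\nu}_{up}}\right\|_2\right)^2}\ge r_{socp}^{(opt)}=\sigma\sqrt{(\alpha-\alpha_w)n}.
$$

Finally one has

$$
E\|\mathbf{w}_{up}\|_2=\sigma\frac{E\left\|\mathbf{h}+\frac{1}{\hat{\nu}_{up}}\mathbf{z}^{(1)}-\frac{\hat{\lambda}_{up}^{(2)}}{\hat{\nu}_{up}}\right\|_2}{\sqrt{(E\|\mathbf{g}\|_2)^2-\left(E\left\|\mathbf{h}+\frac{1}{\hat{\nu}_{up}}\mathbf{z}^{(1)}-\frac{\hat{\lambda}_{up}^{(2)}}{\hat{\nu}_{up}}\right\|_2\right)^2}}\le\sigma\sqrt{\frac{\alpha_w}{\alpha-\alpha_w}}=E\|\mathbf{w}_{gen}^{(opt)}\|_2. \tag{122}
$$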
Since all random quantities of interest concentrate we have the following theorem.

Theorem 3. Assume the setup of Theorem 1. Let $r_{socp}$ in (4) be

$$r_{socp} = r_{socp}^{(opt)} = \sigma\sqrt{(\alpha - \alpha_w)n}. \qquad (123)$$

Then

$$P\left(\|w_{socp}\|_2 < \sigma\sqrt{\frac{\alpha_w}{\alpha - \alpha_w}}\right) > 1 - e^{-\epsilon_1^{(w_{socp})} n}, \qquad (124)$$

where $\epsilon_1^{(w_{socp})} > 0$ is a constant independent of $n$ and $\alpha_w$ is as defined in the fundamental characterization (72). Moreover, if $r_{socp}$ in (4) is such that

$$r_{socp} > r_{socp}^{(opt)} = \sigma\sqrt{(\alpha - \alpha_w)n}, \qquad (125)$$

then

$$P\left(\exists\, w_{socp} \;\Big|\; \|w_{socp}\|_2 > \sigma\sqrt{\frac{\alpha_w}{\alpha - \alpha_w}}\right) > 1 - e^{-\epsilon_2^{(w_{socp})} n}, \qquad (126)$$

where $\epsilon_2^{(w_{socp})} > 0$ is a constant independent of $n$.

Proof. Follows from the discussion presented above and Theorem 1. □
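The relation between the optimal radius in (123) and the error level in (124) can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function names `optimal_radius` and `error_level` are hypothetical, and the choice $\alpha_w = 0.8\alpha$ is an illustrative assumption rather than a value computed from the fundamental characterization (72).

```python
import math


def optimal_radius(alpha, alpha_w, sigma, n):
    """Radius sigma * sqrt((alpha - alpha_w) * n), the form used in (123)."""
    if not 0.0 <= alpha_w < alpha:
        raise ValueError("need 0 <= alpha_w < alpha")
    return sigma * math.sqrt((alpha - alpha_w) * n)


def error_level(alpha, alpha_w, sigma):
    """Level sigma * sqrt(alpha_w / (alpha - alpha_w)) appearing in (124)."""
    if not 0.0 <= alpha_w < alpha:
        raise ValueError("need 0 <= alpha_w < alpha")
    return sigma * math.sqrt(alpha_w / (alpha - alpha_w))


# Illustrative (assumed) numbers: alpha_w = 0.8 * alpha, sigma = 1, n = 400.
alpha, sigma, n = 0.5, 1.0, 400
alpha_w = 0.8 * alpha
m = alpha * n

ratio = optimal_radius(alpha, alpha_w, sigma, n) / math.sqrt(m)
level = error_level(alpha, alpha_w, sigma)
print("r_opt / sqrt(m) =", ratio)   # equals sqrt(1 - alpha_w / alpha) when sigma = 1
print("error level     =", level)
```

With these assumed inputs the radius comes out as $\sqrt{0.2m}$ and the error level as 2, which is the pairing used for the low-regime experiments in Section 2.4.4.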

2.4.3 Computing $E\|w_{gen}\|_2$ and $E\xi_{prim}(\sigma, g, h, r_{socp})$

In this section we present a framework to compute $\|w_{gen}\|_2$ and $\xi_{prim}(\sigma, g, h, r_{socp})$ or, more precisely, their concentrating points $E\|w_{gen}\|_2$ and $E\xi_{prim}(\sigma, g, h, r_{socp})$. All other parameters such as $\nu_{gen}$, $\lambda^{(gen)}$ can (and some of them will) be computed through the framework as well. We do however mention right here that what we present below assumes a fair share of familiarity with the techniques introduced in our earlier papers [62,65]. To shorten the exposition we will skip many details presented in those papers and present only the key differences.

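We start by looking at the following optimization problem from (109)

$$\xi_{prim}(\sigma, g, h, r_{socp}) = \max_{\nu, \lambda^{(2)}} \; \sigma\sqrt{\|g\|_2^2\nu^2 - \|\nu h + z^{(1)} - \lambda^{(2)}\|_2^2} - \nu r_{socp}$$
$$\text{subject to} \quad \nu \ge 0$$
$$\lambda_i^{(2)} = 0, \; n-k+1 \le i \le n$$
$$0 \le \lambda_i^{(2)} \le 2, \; 1 \le i \le n-k. \qquad (127)$$

Using the definitions of $\bar{h}$ and $z^{(2)}$ from Section 2.3.1 we modify the above problem in the following way.

$$\xi_{prim}(\sigma, g, h, r_{socp}) = \max_{\nu, \lambda^{(2)}} \; \sigma\sqrt{\|g\|_2^2\nu^2 - \|\nu\bar{h} - z^{(2)} + \lambda^{(2)}\|_2^2} - \nu r_{socp}$$
$$\text{subject to} \quad \nu \ge 0$$
$$\lambda_i^{(2)} = 0, \; n-k+1 \le i \le n$$
$$0 \le \lambda_i^{(2)} \le 1, \; 1 \le i \le n-k. \qquad (128)$$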

Now, let $\lambda^{(gen)}$ be the solution of the above optimization (this is a slight abuse of notation since, due to the above restructuring of $h$, this $\lambda^{(gen)}$ is different from the one in the above Theorem). Following what was presented in [65] there will be a parameter $c_{gen}$ such that $\lambda^{(gen)} = [\lambda_1^{(gen)}, \lambda_2^{(gen)}, \dots, \lambda_{c_{gen}}^{(gen)}, 0, 0, \dots, 0]$ and obviously $c_{gen} \le n - k$. At this point let us assume that this parameter is known and fixed. Then following [65] the above optimization becomes

$$\max_{\nu} \; \sigma\sqrt{\|g\|_2^2\nu^2 - \|\nu\bar{h}_{c_{gen}+1:n} - z^{(2)}_{c_{gen}+1:n}\|_2^2} - \nu r_{socp}$$
$$\text{subject to} \quad \nu \ge 0. \qquad (129)$$

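We then proceed by solving the above optimization over $\nu$. To do so we first look at the derivative with respect to $\nu$ of the objective in (129). Computing the derivative and equating it to zero gives

$$\sigma\,\frac{(\|g\|_2^2 - \|\bar{h}_{c_{gen}+1:n}\|_2^2)\nu + \bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n}}{\sqrt{\|g\|_2^2\nu^2 - \|\nu\bar{h}_{c_{gen}+1:n} - z^{(2)}_{c_{gen}+1:n}\|_2^2}} = r_{socp}. \qquad (130)$$

Let

$$a_{gen} = \sigma\,\frac{\|g\|_2^2 - \|\bar{h}_{c_{gen}+1:n}\|_2^2}{r_{socp}}$$
$$b_{gen} = \sigma\,\frac{\bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n}}{r_{socp}}. \qquad (131)$$

Then combining (130) and (131) one obtains

$$(a_{gen}\nu + b_{gen})^2 = \|g\|_2^2\nu^2 - \|\nu\bar{h}_{c_{gen}+1:n} - z^{(2)}_{c_{gen}+1:n}\|_2^2. \qquad (132)$$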

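After solving (132) over $\nu$ we have

$$\nu = \frac{-(a_{gen}b_{gen} - \bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n}) - \sqrt{(a_{gen}b_{gen} - \bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n})^2 - (b_{gen}^2 + \|z^{(2)}_{c_{gen}+1:n}\|_2^2)(a_{gen}^2 - \|g\|_2^2 + \|\bar{h}_{c_{gen}+1:n}\|_2^2)}}{a_{gen}^2 - \|g\|_2^2 + \|\bar{h}_{c_{gen}+1:n}\|_2^2}. \qquad (133)$$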
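The step from the stationarity condition to a closed-form $\nu$ can be sketched numerically. The snippet below is a minimal illustration under stated assumptions, not the procedure of [65]: the function name `stationary_nu` and its arguments are hypothetical, the inputs stand for $\|g\|_2^2$, $\|\bar{h}_{c_{gen}+1:n}\|_2^2$, $\bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n}$ and $\|z^{(2)}_{c_{gen}+1:n}\|_2^2$, and the numerical values are made up for the example. Since squaring the stationarity condition can introduce an extraneous root, the sketch keeps only roots with $\nu \ge 0$ and $a_{gen}\nu + b_{gen} \ge 0$.

```python
import math


def stationary_nu(g2, h2, hz, z2, sigma, r):
    """Solve (a*nu + b)^2 = g2*nu^2 - (h2*nu^2 - 2*hz*nu + z2) for nu >= 0.

    g2, h2, z2 are squared norms and hz is an inner product (all assumed
    given); a and b follow the definitions a = sigma*(g2 - h2)/r and
    b = sigma*hz/r.
    """
    a = sigma * (g2 - h2) / r
    b = sigma * hz / r
    # Quadratic q*nu^2 + 2*p*nu + s = 0 obtained by expanding both sides.
    q = a * a - g2 + h2
    p = a * b - hz
    s = b * b + z2
    if abs(q) < 1e-12:
        roots = [-s / (2.0 * p)]
    else:
        disc = p * p - q * s
        if disc < 0.0:
            raise ValueError("no real stationary point")
        root = math.sqrt(disc)
        roots = [(-p - root) / q, (-p + root) / q]

    def objective(nu):
        inside = g2 * nu * nu - (h2 * nu * nu - 2.0 * hz * nu + z2)
        return sigma * math.sqrt(max(inside, 0.0)) - r * nu

    # Keep only roots consistent with the unsquared stationarity condition.
    valid = [nu for nu in roots if nu >= 0.0 and a * nu + b >= 0.0]
    if not valid:
        raise ValueError("no admissible stationary point")
    return max(valid, key=objective)


# Made-up inputs, roughly on the scale of n = 400, m = 120, sigma = 1.
g2, h2, hz, z2 = 120.0, 135.7, 53.37, 35.6
sigma, r = 1.0, math.sqrt(120.0)
nu = stationary_nu(g2, h2, hz, z2, sigma, r)
print("nu =", nu)
```

For these inputs the admissible root is the one with the minus sign in front of the square root; the other root violates $a_{gen}\nu + b_{gen} \ge 0$.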
Given the structure of $a_{gen}$ and $b_{gen}$, (133) can be simplified a bit. However, we find it more appealing to work with (133). Combining (128), (129), and (133) one obtains the following equation (rather an inequality) that can be used to determine $c_{gen}$ (essentially $c_{gen}$ is the largest natural number such that the left-hand side of the equation below is less than 1; since we will assume a large dimensional scenario we will, instead of any of the inequalities below, write an equality; this will make writing much easier).

$$\frac{-(a_{gen}b_{gen} - \bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n}) - \sqrt{(a_{gen}b_{gen} - \bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n})^2 - (b_{gen}^2 + \|z^{(2)}_{c_{gen}+1:n}\|_2^2)(a_{gen}^2 - \|g\|_2^2 + \|\bar{h}_{c_{gen}+1:n}\|_2^2)}}{a_{gen}^2 - \|g\|_2^2 + \|\bar{h}_{c_{gen}+1:n}\|_2^2}\;\bar{h}_{c_{gen}} = 1. \qquad (134)$$

Let $c_{gen}$ be the solution of (134). Then

$$\nu_{gen} = \frac{-(a_{gen}b_{gen} - \bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n}) - \sqrt{(a_{gen}b_{gen} - \bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n})^2 - (b_{gen}^2 + \|z^{(2)}_{c_{gen}+1:n}\|_2^2)(a_{gen}^2 - \|g\|_2^2 + \|\bar{h}_{c_{gen}+1:n}\|_2^2)}}{a_{gen}^2 - \|g\|_2^2 + \|\bar{h}_{c_{gen}+1:n}\|_2^2}. \qquad (135)$$


From (111) one then has

$$\|w_{gen}\|_2 = \sigma\,\frac{\|\nu_{gen}\bar{h}_{c_{gen}+1:n} - z^{(2)}_{c_{gen}+1:n}\|_2}{\sqrt{\|g\|_2^2\nu_{gen}^2 - \|\nu_{gen}\bar{h}_{c_{gen}+1:n} - z^{(2)}_{c_{gen}+1:n}\|_2^2}}. \qquad (136)$$

The combination of (134), (135), and (136) is conceptually enough to determine $\|w_{gen}\|_2$. What is left to be done is a computation of all unknown quantities that appear in (134), (135), and (136). We will show below how that can be done. As mentioned earlier, what we will present substantially relies on what was shown in [65] and we assume a familiarity with the procedure presented there.

The first thing to resolve is (134). Since all random quantities concentrate we will be dealing (as in [65]) with the expected values. To compute $c_{gen}$ in (134) we will need the following expected values

$$E\|g\|_2^2, \quad E\|\bar{h}_{c_{gen}+1:n}\|_2^2, \quad E(\bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n}). \qquad (137)$$

Clearly, since the components of $g$ are i.i.d. standard normals one easily has

$$E\|g\|_2^2 = m. \qquad (138)$$

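Let $c_{gen} = (1 - \theta)n$ where $\theta$ is a constant independent of $n$. Then as shown in [65]

$$\lim_{n\to\infty}\frac{E\|\bar{h}_{c_{gen}+1:n}\|_2^2}{n} = \frac{1-\beta_w}{\sqrt{2\pi}}\left(\sqrt{2\pi} + 2\,\frac{\sqrt{2(\mathrm{erfinv}(\frac{1-\theta}{1-\beta_w}))^2}}{e^{(\mathrm{erfinv}(\frac{1-\theta}{1-\beta_w}))^2}} - \sqrt{2\pi}\,\frac{1-\theta}{1-\beta_w}\right) + \beta_w, \qquad (139)$$

where we of course recall that $\beta_w = \frac{k}{n}$. Also, as shown in [65]

$$\lim_{n\to\infty}\frac{E(\bar{h}_{c_{gen}+1:n}^T z^{(2)}_{c_{gen}+1:n})}{n} = (1-\beta_w)\sqrt{\frac{2}{\pi}}\,e^{-(\mathrm{erfinv}(\frac{1-\theta}{1-\beta_w}))^2}. \qquad (140)$$

The only other thing that we will need in order to be able to compute $c_{gen}$ (besides the expectations from (137)) is the following inequality related to the behavior of $\bar{h}_{c_{gen}}$. Again, as shown in [65]

$$P\left(\sqrt{2}\,\mathrm{erfinv}\left((1+\epsilon_1^{(c)})\frac{1-\theta}{1-\beta_w}\right) \le \bar{h}_{c_{gen}}\right) \le e^{-\epsilon_2^{(c)} n}, \qquad (141)$$

where $\epsilon_1^{(c)} > 0$ is an arbitrarily small constant and $\epsilon_2^{(c)}$ is a constant dependent on $\epsilon_1^{(c)}$ but independent of $n$ (essentially one only needs this direction in (134); however, a similar reverse holds as well).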

At this point we have all the necessary ingredients to determine $c_{gen}$ and consequently $\nu_{gen}$ and $\|w_{gen}\|_2$ (of course in a random setup determining $c_{gen}$, $\nu_{gen}$, and $\|w_{gen}\|_2$ does not really make sense; what we really mean is determining their concentrating points). The following corollary then provides a systematic way of doing so.

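Corollary 1. Assume the setup of Theorems 1 and 2. Let $\bar{h}$ be as defined in (69) and let $r_{socp}^{(sc)} = \lim_{n\to\infty}\frac{r_{socp}}{\sqrt{n}}$. Let $\alpha = \frac{m}{n}$ and $\beta_w = \frac{k}{n}$ be fixed. Consider the following

$$A(\theta) = \lim_{n\to\infty}\frac{Ea_{gen}}{\sqrt{n}} = \sigma\,\frac{\alpha - D(\theta)}{r_{socp}^{(sc)}}$$
$$B(\theta) = \lim_{n\to\infty}\frac{Eb_{gen}}{\sqrt{n}} = \sigma\,\frac{C(\theta)}{r_{socp}^{(sc)}}$$
$$F(\theta) = \sqrt{2}\,\mathrm{erfinv}\left(\frac{1-\theta}{1-\beta_w}\right), \qquad (142)$$

where

$$C(\theta) = \lim_{n\to\infty}\frac{E(\bar{h}_{(1-\theta)n+1:n}^T z^{(2)}_{(1-\theta)n+1:n})}{n} = (1-\beta_w)\sqrt{\frac{2}{\pi}}\,e^{-(\mathrm{erfinv}(\frac{1-\theta}{1-\beta_w}))^2}$$
$$D(\theta) = \lim_{n\to\infty}\frac{E\|\bar{h}_{(1-\theta)n+1:n}\|_2^2}{n} = \frac{1-\beta_w}{\sqrt{2\pi}}\left(\sqrt{2\pi} + 2\,\frac{\sqrt{2(\mathrm{erfinv}(\frac{1-\theta}{1-\beta_w}))^2}}{e^{(\mathrm{erfinv}(\frac{1-\theta}{1-\beta_w}))^2}} - \sqrt{2\pi}\,\frac{1-\theta}{1-\beta_w}\right) + \beta_w. \qquad (143)$$

Let $\theta$ be the solution of

$$\frac{-(A(\theta)B(\theta) - C(\theta)) - \sqrt{(A(\theta)B(\theta) - C(\theta))^2 - (B(\theta)^2 + \theta)(A(\theta)^2 - \alpha + D(\theta))}}{A(\theta)^2 - \alpha + D(\theta)} = \frac{1}{F(\theta)}. \qquad (144)$$

Then the concentrating points of $\nu_{gen}$, $\|w_{gen}\|_2$, and $\xi_{prim}(\sigma, g, h, r_{socp})$ in Theorem 2 can be determined as

$$E\nu_{gen} = \frac{-(A(\theta)B(\theta) - C(\theta)) - \sqrt{(A(\theta)B(\theta) - C(\theta))^2 - (B(\theta)^2 + \theta)(A(\theta)^2 - \alpha + D(\theta))}}{A(\theta)^2 - \alpha + D(\theta)}$$
$$E\|w_{gen}\|_2 = \sigma\sqrt{\frac{(E\nu_{gen})^2 D(\theta) - 2E\nu_{gen}C(\theta) + \theta}{\alpha(E\nu_{gen})^2 - ((E\nu_{gen})^2 D(\theta) - 2E\nu_{gen}C(\theta) + \theta)}}$$
$$\frac{E\xi_{prim}(\sigma, g, h, r_{socp})}{\sqrt{n}} = \sigma\sqrt{\alpha(E\nu_{gen})^2 - ((E\nu_{gen})^2 D(\theta) - 2E\nu_{gen}C(\theta) + \theta)} - E\nu_{gen}\,r_{socp}^{(sc)}. \qquad (145)$$

Proof. Follows from Theorem 2 and the discussion presented above. □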
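The corollary can be turned into a short numerical routine. What follows is a minimal sketch under explicit assumptions, not the author's implementation: it assumes that (144) amounts to $E\nu_{gen}\,F(\theta) = 1$, substitutes $\nu = 1/F(\theta)$ into the scaled form of the stationarity condition (132), and locates $\theta$ by bisection. All function names are hypothetical, the inverse error function is obtained by bisection on `math.erf` to stay within the standard library, and a single sign change of the residual on $(\beta_w, 1)$ is assumed.

```python
import math


def erfinv(y):
    """Inverse of math.erf on [0, 1), computed by bisection."""
    if not 0.0 <= y < 1.0:
        raise ValueError("erfinv is implemented here only for 0 <= y < 1")
    lo, hi = 0.0, 6.0  # math.erf(6.0) is 1.0 in double precision
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def c_d_f(theta, beta_w):
    """C(theta) and D(theta) as in (143), F(theta) as in (142)."""
    t = erfinv((1.0 - theta) / (1.0 - beta_w))
    e = math.exp(-t * t)
    c = (1.0 - beta_w) * math.sqrt(2.0 / math.pi) * e
    d = ((1.0 - beta_w) / math.sqrt(2.0 * math.pi)
         * (math.sqrt(2.0 * math.pi) + 2.0 * math.sqrt(2.0) * t * e
            - math.sqrt(2.0 * math.pi) * (1.0 - theta) / (1.0 - beta_w))
         + beta_w)
    return c, d, math.sqrt(2.0) * t


def residual(theta, alpha, beta_w, sigma, r_sc):
    """(A*nu + B) - sqrt(alpha*nu^2 - X) evaluated at nu = 1/F(theta)."""
    c, d, f = c_d_f(theta, beta_w)
    nu = 1.0 / f
    x = nu * nu * d - 2.0 * nu * c + theta
    a = sigma * (alpha - d) / r_sc
    b = sigma * c / r_sc
    return a * nu + b - math.sqrt(max(alpha * nu * nu - x, 0.0))


def concentrating_points(alpha, beta_w, sigma, r_sc):
    """Return (theta, E nu_gen, E||w_gen||_2, scaled E xi_prim)."""
    lo, hi = beta_w + 1e-9, 1.0 - 1e-9
    if residual(lo, alpha, beta_w, sigma, r_sc) <= 0.0:
        raise ValueError("no sign change found at the lower end")
    if residual(hi, alpha, beta_w, sigma, r_sc) >= 0.0:
        raise ValueError("no sign change found at the upper end")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid, alpha, beta_w, sigma, r_sc) > 0.0:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    c, d, f = c_d_f(theta, beta_w)
    nu = 1.0 / f
    x = nu * nu * d - 2.0 * nu * c + theta
    gap = alpha * nu * nu - x
    w_norm = sigma * math.sqrt(x / gap)
    xi = sigma * math.sqrt(gap) - nu * r_sc
    return theta, nu, w_norm, xi


# First row of Table 1: alpha = 0.3, beta_w / alpha = 0.1, sigma = 1, r_socp = sqrt(m).
theta, nu, w_norm, xi = concentrating_points(0.3, 0.03, 1.0, math.sqrt(0.3))
print(theta, nu, w_norm, xi)
```

For the first row of Table 1 this sketch should land close to the theoretical entries reported there for $E\nu_{gen}$ and $E\|w_{gen}\|_2$ (0.5333 and 1.0103).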

The results from the above corollary can then be used to compute parameters of interest in our derivation for particular values of $\beta_w$, $\alpha$, $\sigma$, and $r_{socp}$. We conducted extensive numerical experiments and found that the results one can get through them are in firm agreement (as they should be) with what the presented theory predicts. This paper is above all intended to be an introductory presentation of a framework for the analysis of the SOCP algorithms and we therefore refrain from a substantial discussion related to the results obtained through the numerical experiments and their agreement with the theory. We instead defer such a discussion to several forthcoming papers. Just to give an idea of how powerful the introduced mechanism is, in the next subsection we present only a small sample of the conducted numerical experiments.

Table 1: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp} = \sqrt{m}$, $\sigma = 1$; (4) was run 500 times with $n = 400$; (127) was run 500 times with $n = 1000$

α     β_w/α   Eν_gen          Eξ_prim(1,g,h,√m)   E‖w_gen‖_2      Ef_obj/√n       E‖w_socp‖_2
0.3   0.1     0.5353/0.5333   0.0872/0.0866       1.0194/1.0103   0.0870/0.0866   1.0237/1.0103
0.3   0.15    0.5867/0.5846   0.1388/0.1369       1.4710/1.4322   0.1393/0.1369   1.4543/1.4322
0.3   0.18    0.6199/0.6157   0.1747/0.1717       1.8685/1.7746   0.1711/0.1717   1.7767/1.7746
0.5   0.1     0.5767/0.5761   0.1037/0.1046       0.8960/0.9005   0.1032/0.1046   0.9024/0.9005
0.5   0.2     0.6919/0.6899   0.2278/0.2268       1.5989/1.5790   0.2285/0.2268   1.5907/1.5790
0.5   0.25    0.7557/0.7509   0.3080/0.3027       2.2099/2.1006   0.3047/0.3027   2.1502/2.1006
0.7   0.15    0.6713/0.6710   0.1808/0.1819       1.0875/1.0902   0.1812/0.1819   1.0909/1.0902
0.7   0.22    0.7565/0.7555   0.2818/0.2809       1.5086/1.4963   0.2804/0.2809   1.5062/1.4963
0.7   0.3     0.8663/0.8624   0.4210/0.4170       2.2136/2.1476   0.4219/0.4170   2.1773/2.1476



2.4.4 Numerical experiments 

Using (142), (143), (144), and (145) one can then for any $r_{socp}$, any $\sigma$, and any pair $(\alpha, \beta_w)$ (that is below the fundamental characterization (72)) determine the value of $E\|w_{socp}\|_2$ as well as the concentrating points of all other quantities in our derivations. We will split the presentation of the numerical results in four parts. To demonstrate the precision of our technique, in the first couple of experiments we will run both the SOCP from (4) and (127). In some of the later experiment sets we will instead focus solely on the SOCP from (4), whose performance analysis is actually the leading topic of this paper.

1) Random examples from low $(\alpha, \beta_w)$ regime

Under low $(\alpha, \beta_w)$ regime we consider pairs $(\alpha, \beta_w)$ that are well below the fundamental characterization (72). We ran (127) 500 times for $\alpha \in \{0.3, 0.5, 0.7\}$, $n = 1000$, $\sigma = 1$, $r_{socp} = \sqrt{m} = \sqrt{\alpha n}$, and various randomly chosen values of $\beta_w$. In parallel, we ran (4) 500 times with the same parameters, except that (4) was run for $n = 400$. Also, since the non-zero components of $\tilde{x}$ can not really be made infinite
we set them to be ^ when generating (3) (we could/should have set them higher but this already works 

fairly well). The results we obtained for $E\nu_{gen}$, $E\xi_{prim}(\sigma, g, h, r_{socp})$, $E\|w_{gen}\|_2$, $Ef_{obj}$, and $E\|w_{socp}\|_2$ through these experiments are presented in Table 1. The theoretical values for any of these quantities in any of the simulated scenarios are given in parallel (the second number in each pair). We observe a solid agreement between the theoretical predictions and the results obtained through numerical experiments.

2) Specific examples in low $(\alpha, \beta_w)$ regime

a) $r_{socp} = r_{socp}^{(opt)} = \sigma\sqrt{(\alpha - \alpha_w)n}$

We also ran a carefully designed set of experiments intended to show a specific behavior of the SOCP from (4) and the above theoretical predictions. Namely, for a pair $(\alpha, \beta_w)$, instead of choosing $r_{socp}$ as $\sqrt{m} = \sqrt{\alpha n}$ (which is, as discussed in Section 1, how one could do it if solely based on statistics of $v$) we chose $r_{socp} = \sigma\sqrt{(\alpha - \alpha_w)n}$, where $\alpha_w$ is the one that corresponds to $\beta_w$ in the fundamental characterization (72). As discussed in Section 2.4.2 this choice could in a certain sense be optimal. Moreover, as discussed in [62] this choice of $r_{socp}$ should make the norm-2 of the error vector in (4) no worse (larger) than the one that can be obtained via a couple of LASSO algorithms considered in [62]. We then considered the LASSO contour line from [62] that corresponds to the norm-2 of the error vector equal to 2 and from that line we chose three pairs $(\alpha, \beta_w)$ (see Table 2) for which we then ran (4) (for completeness and ease of following we present the LASSO contour lines again in Figure 3; in fact, as argued in Section 2.4.2 and [62], with $r_{socp}$ as above the performance of the SOCP from (4) can also be characterized by these lines, i.e. it is not really necessary to refer to them as LASSO contour lines; one may as well refer to them as SOCP contour lines). Further, we again set $\sigma = 1$. Based on the results of [62] it is then easy to see that on the contour line that corresponds to the norm-2 of the error vector equal to 2, $r_{socp} = \sqrt{0.2m}$. We ran (4) 200 times with $n = 400$. In parallel, for the same set of parameters, we also ran (127). To get somewhat better concentration results we ran (127) 500 times with $n = 5000$. The obtained results are presented in Table 2. The theoretical values for any of the simulated quantities in any of the simulated scenarios are again given in parallel (the second number in each pair). We again observe a solid agreement between the theoretical predictions and the results obtained through numerical experiments.

Table 2: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp} = \sqrt{0.2m}$, $\sigma = 1$; (4) was run 200 times with $n = 400$; (127) was run 500 times with $n = 5000$

α     β_w/α   Eν_gen          Eξ_prim(1,g,h,√(0.2m))   E‖w_gen‖_2   Ef_obj      E‖w_socp‖_2
0.3   0.21    0.7617/0.7610   0.0008/0                 2.0325/2     -0.0051/0   2.0201/2
0.5   0.27    0.9800/0.9778   0.0007/0                 2.0199/2     0.0045/0    2.0463/2
0.7   0.33    1.2570/1.2565   0.0011/0                 2.0158/2     -0.0080/0   2.0036/2

Table 3: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp} \in \{\sqrt{0.2m}, \sqrt{0.6m}, \sqrt{m}\}$, $\sigma = 1$; (4) was run 200 times with $n = 400$

              r_socp = √(0.2m)          r_socp = √(0.6m)                r_socp = √m
α     β_w/α   Ef_obj      E‖w_socp‖_2   Ef_obj          E‖w_socp‖_2     Ef_obj          E‖w_socp‖_2
0.3   0.21    -0.0051/0   2.0201/2      0.1332/0.1295   2.2235/2.0943   0.2178/0.2120   2.4794/2.2639
0.5   0.27    0.0045/0    2.0463/2      0.2152/0.2092   2.2245/2.1495   0.3399/0.3377   2.4570/2.3884
0.7   0.33    -0.0080/0   2.0036/2      0.3095/0.3048   2.2995/2.2190   0.4877/0.4847   2.5779/2.5394
b) Varying $r_{socp}$ from $\sqrt{0.2m}$ to $\sqrt{m}$

To observe how the norm-2 of the error vector changes with a change in $r_{socp}$ we conducted a set of experiments where we chose the same three pairs $(\alpha, \beta_w)$ as in the previous set but varied $r_{socp}$. We varied $r_{socp}$ over the set $\{\sqrt{0.2m}, \sqrt{0.6m}, \sqrt{m}\}$. This time we focused only on SOCP and ran only (4). We ran (4) 200 times with $n = 400$. The obtained results are presented in Table 3. Again, the theoretical predictions are given in parallel (the second number in each pair). We again observe a solid agreement between the theoretical predictions and numerical results. Also, from Table 3 one can see that as $r_{socp}$ decreases from $\sqrt{m}$ to $\sqrt{0.2m}$, $E\|w_{socp}\|_2$ decreases as well.

3) Specific examples in high $(\alpha, \beta_w)$ regime

a) $r_{socp} = r_{socp}^{(opt)} = \sigma\sqrt{(\alpha - \alpha_w)n}$

We also ran a carefully designed set of experiments intended to show a specific behavior of the SOCP from (4) and the above theoretical predictions in the "high" $(\alpha, \beta_w)$ regime (under "high" $(\alpha, \beta_w)$ regime we of course assume pairs $(\alpha, \beta_w)$ that are relatively close to the fundamental characterization). Again, for a pair $(\alpha, \beta_w)$, instead of choosing $r_{socp}$ as $\sqrt{m} = \sqrt{\alpha n}$ we chose it based on the SOCP/LASSO contour lines. This time, though, we considered the contour line from [62] (or Figure 3) that corresponds to the norm-2 of the error vector equal to 3 and from that line we chose three pairs $(\alpha, \beta_w)$ (see Table 4) for which we then ran (4). As usual, to make the scaling smoother we set $\sigma = 1$. Based on the results from Section 2.4.2 and [62] it is then easy to see that $r_{socp} = \sqrt{0.1m}$. To get somewhat better concentration results (the pairs $(\alpha, \beta_w)$ are now fairly close to the fundamental characterization) we ran (4) 200 times with $n = 2000$ and in parallel we ran (127) 200 times with $n = 10000$ for the same set of other parameters. The obtained results are presented in Table 4. The theoretical values for any of the simulated quantities in any of the simulated scenarios are again given in parallel (the second number in each pair). We again observe a solid agreement between the theoretical predictions and the results obtained through numerical experiments.

Table 4: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp} = \sqrt{0.1m}$, $\sigma = 1$; (4) was run 200 times with $n = 2000$; (127) was run 200 times with $n = 10000$

α     β_w/α   Eν_gen          Eξ_prim(1,g,h,√(0.1m))   E‖w_gen‖_2   Ef_obj     E‖w_socp‖_2
0.3   0.249   0.8005/0.7995   0.0024/0                 3.1780/3     0.0011/0   3.1956/3
0.5   0.325   1.0574/1.0552   -0.0016/0                3.0300/3     0.0004/0   3.0154/3
0.7   0.41    1.4203/1.4193   0.0017/0                 3.0481/3     0.0002/0   3.0147/3

Table 5: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp} \in \{\sqrt{0.1m}, \sqrt{0.5m}, \sqrt{m}\}$, $\sigma = 1$; (4) was run 200 times with $n = 2000$

              r_socp = √(0.1m)         r_socp = √(0.5m)                r_socp = √m
α     β_w/α   Ef_obj     E‖w_socp‖_2   Ef_obj          E‖w_socp‖_2     Ef_obj          E‖w_socp‖_2
0.3   0.249   0.0011/0   3.1956/3      0.1613/0.1639   3.2050/3.1710   0.2763/0.2792   3.5261/3.5053
0.5   0.325   0.0004/0   3.0154/3      0.2757/0.2722   3.4015/3.2840   0.4623/0.4576   3.9177/3.7774
0.7   0.41    0.0002/0   3.0147/3      0.4143/0.4145   3.4878/3.4563   0.5530/0.6857   4.3548/4.1603

b) Varying $r_{socp}$ from $\sqrt{0.1m}$ to $\sqrt{m}$

We also conducted a set of high regime experiments that are analogous to the varying $r_{socp}$ experiments in the lower regime. We maintained the structure of the experiments as in the lower regime with a different way of choosing the three pairs $(\alpha, \beta_w)$. As above we chose them from the LASSO contour line that corresponds to the norm-2 of the error vector equal to 3 (this is of course the same as in Table 4). Again, as above one has $r_{socp}^{(opt)} = \sigma\sqrt{(\alpha - \alpha_w)n} = \sqrt{0.1m}$ (we again for the simplicity of scaling assume $\sigma = 1$). We then varied $r_{socp}$ over the set $\{\sqrt{0.1m}, \sqrt{0.5m}, \sqrt{m}\}$ and again focused only on SOCP and ran (4). We ran (4) 200 times with $n = 2000$. The obtained results are presented in Table 5. The theoretical predictions are given in parallel (the second number in each pair). The results obtained through numerical experiments are again in a solid agreement with the theoretical predictions. Also, as was the case in the lower regime, one can see again that as $r_{socp}$ decreases from $\sqrt{m}$ to $\sqrt{0.1m}$, $E\|w_{socp}\|_2$ decreases as well.

4) SOCP contour lines

As mentioned earlier, for any pair $(\alpha, \beta_w)$ there is a particular choice of $r_{socp}$ such that the "generic" (worst-case) norm-2 of the error vector of the SOCP from (4), $\|w_{socp}\|_2$, is the smallest. Moreover, as shown in [62], for such a choice of $r_{socp}$, $\|w_{socp}\|_2$ can be made as small as the corresponding $\|w_{lasso}\|_2$ of the LASSO algorithms considered in [62]. Namely, for $r_{socp} = \sigma\sqrt{(\alpha - \alpha_w)n}$ one has (in a generic scenario) $E\|w_{socp}\|_2 = E\|w_{lasso}\|_2 = \sigma\sqrt{\frac{\alpha_w}{\alpha - \alpha_w}}$. Let $\rho = \sqrt{\frac{\alpha_w}{\alpha - \alpha_w}}$. Then for different values of $\rho$ one has the contour lines in the $(\alpha, \beta_w)$ plane below which with overwhelming probability $\|w_{socp}\|_2 < \sigma\rho$. Clearly all the contour lines are achieved if the SOCP from (4) is run (for any $(\alpha, \beta_w)$ from the contour line) with $r_{socp} = r_{socp}(\rho) = \sigma\sqrt{(\alpha - \alpha_w)n} = \sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$. In Figure 4 we show what impact a change of the optimal $r_{socp}$ has on the contour lines. For concreteness, instead of choosing $r_{socp} = r_{socp}(\rho) = \sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$ we chose $r_{socp} = \sigma\sqrt{\alpha n}$. As can be seen from the plots, as $r_{socp}$ increases from $\sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$ to $\sigma\sqrt{\alpha n}$ the contour lines that guarantee the same ratio $\rho = E\|w_{socp}\|_2/\sigma$ go down. However, the difference is more pronounced in the high $\alpha$ regime (the difference in $r_{socp}$ is of course more pronounced in that regime as well; $r_{socp}$ is proportional to $\sqrt{\alpha n}$).



Figure 3: $(\alpha, \beta_w)$ curves as functions of $\rho = \|w_{socp}\|_2/\sigma$ for the SOCP algorithm from (4) run with $r_{socp} = r_{socp}(\rho) = \sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$

Figure 4: Deviation of $(\alpha, \beta_w)$ curves as functions of $\rho = \|w_{socp}\|_2/\sigma$; solid lines are for the SOCP from (4) run with $r_{socp} = r_{socp}(\rho) = \sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$; dashed lines are for the SOCP from (4) run with $r_{socp} = \sigma\sqrt{\alpha n}$

3 SOCP's performance analysis framework - signed x

In this section we show how the SOCP's performance analysis framework developed in the previous section can be specialized to the case when signals are a priori known to have nonzero components of certain sign. All major assumptions stated at the beginning of the previous section will continue to hold in this section as well; namely, we will continue to consider matrices $A$ with i.i.d. standard normal random variables; elements of $v$ will again be i.i.d. Gaussian random variables with zero mean and variance $\sigma^2$. The main difference, though, comes in the definition of $\tilde{x}$. We will in this section assume that $\tilde{x}$ is the original $x$ in (3) that we are trying to recover and that it is any $k$-sparse vector with a given fixed location of its nonzero elements and with a priori known signs of its elements. Given the statistical context, it will be fairly easy to see later on that nothing that we will present in this section depends on what particular location and what particular combination of signs of nonzero elements are chosen. We therefore for the simplicity of the exposition and without loss of generality assume that the components $\tilde{x}_1, \tilde{x}_2, \dots, \tilde{x}_{n-k}$ of $\tilde{x}$ are equal to zero and that the remaining components of $\tilde{x}$, $\tilde{x}_{n-k+1}, \tilde{x}_{n-k+2}, \dots, \tilde{x}_n$, are greater than or equal to zero. However, differently from what was assumed in the previous section, we now assume that this information is a priori known. That essentially means that this information is also known to the solving algorithm. Then instead of (4) one can consider its better ("signed") version

$$\min_x \; \|x\|_1$$
$$\text{subject to} \quad \|y - Ax\|_2 \le r$$
$$x_i \ge 0, \; 1 \le i \le n. \qquad (146)$$

Of course, given the positivity of $x_i$, $1 \le i \le n$, one can replace the $\ell_1$ norm in the objective by the sum of all elements of $x$. However, to maintain visual similarity between what we will present in this section and what we presented in Section 2 we will keep the $\ell_1$ norm in the objective. Along the same lines, in what follows we will mimic the procedure presented in the previous section, skip all the obvious parallels, and emphasize the points that are different. To make the analysis of the "signed" case as parallel as possible to the analysis of the "general" case we will again for the analysis purposes modify the objective of the above optimization problem so that it becomes

$$\min_x \; \|x\|_1 - \|\tilde{x}\|_1$$
$$\text{subject to} \quad \|y - Ax\|_2 \le r_{socp+}$$
$$x_i \ge 0, \; 1 \le i \le n. \qquad (147)$$



One should again note that this modification of (146) is for the analysis purposes only, i.e. (147) is not the algorithm one would be running while searching for an approximation to $\tilde{x}$ (similarly to (9), (147) can not be run anyway, since it requires knowledge of $\|\tilde{x}\|_1$ which, of course, is unavailable). The SOCP algorithm one would actually use to find an approximation to the "signed" $\tilde{x}$ is the one in (146) (of course with $r = r_{socp+}$). It is just for the ease of the exposition that we will look at the modification (147) and not at the original problem (146). Also, one should again note that $r$ in (146) or $r_{socp+}$ in (147) is a parameter that critically impacts the outcome of any SOCP type of algorithm (again, for different $r$'s one will have different SOCP's). The analysis that we will present assumes a general $r$ that we will call $r_{socp+}$. As was the case in Section 2, we will in later subsections (basically when the analysis is done) comment in more detail on the effect that the choice of $r_{socp+}$ has on the analysis or, more importantly, on the performance of the optimization algorithm from (146). Right here, we do mention that problem (147) is not feasible for all choices of $\tilde{x}$, $\alpha$, $\beta_w^+$, $\sigma$, and $r_{socp+}$. What we present below assumes that $\tilde{x}$, $\alpha$, $\beta_w^+$, $\sigma$, and $r_{socp+}$ are such that (147) is feasible with overwhelming probability. For example, a statistical choice $r_{socp+} > \sigma\sqrt{m}$ guarantees feasibility with overwhelming probability. Of course, there are other choices of parameters $\tilde{x}$, $\alpha$, $\beta_w^+$, $\sigma$, and $r_{socp+}$ that guarantee feasibility as well. However, since our primary goal in this paper is to present a framework that can be used to analyze (147) when it is feasible, we refrain from a substantial discussion about the feasibility of (147) and defer it to one of the forthcoming papers.
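The feasibility remark can be illustrated with a small simulation. The sketch below rests on one observation and several assumptions: since $y = A\tilde{x} + v$ and $\tilde{x}$ has nonnegative components, the point $x = \tilde{x}$ satisfies the constraints of (147) whenever $\|v\|_2 \le r_{socp+}$, so it is enough to look at how often $\|v\|_2$ stays below a radius slightly above $\sigma\sqrt{m}$. The function name `feasibility_rate`, the slack parameter, and all numerical values are hypothetical choices made for the example; noise entries are drawn with standard deviation $\sigma$.

```python
import math
import random


def feasibility_rate(m, sigma, slack, trials, seed=0):
    """Fraction of noise draws with ||v||_2 <= (1 + slack) * sigma * sqrt(m).

    Each draw is an m-dimensional vector of i.i.d. zero-mean Gaussians with
    standard deviation sigma. Whenever the inequality holds, x = x_tilde is
    a feasible point of the signed problem with that radius.
    """
    rng = random.Random(seed)
    radius = (1.0 + slack) * sigma * math.sqrt(m)
    hits = 0
    for _ in range(trials):
        norm_v = math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(m)))
        if norm_v <= radius:
            hits += 1
    return hits / trials


# Assumed example values: m = 400, sigma = 1, 200 trials.
above = feasibility_rate(400, 1.0, 0.2, 200)    # radius 20% above sigma*sqrt(m)
below = feasibility_rate(400, 1.0, -0.2, 200)   # radius 20% below sigma*sqrt(m)
print(above, below)
```

A low rate for the smaller radius only says that $\tilde{x}$ itself is rarely feasible there; it does not by itself show that (147) is infeasible, since other points may still satisfy the constraints.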

Given that we will be dealing with (147) let us define the optimal value of its objective in the following way

$$f_{obj+} = \min_x \; \|x\|_1 - \|\tilde{x}\|_1$$
$$\text{subject to} \quad \|y - Ax\|_2 \le r_{socp+}$$
$$x_i \ge 0, \; 1 \le i \le n. \qquad (148)$$

Clearly, $f_{obj+}$ is a function of $\sigma$, $\tilde{x}$, $A$, $v$. To make writing easier we will adopt the same convention as in Section 2 and omit them. As in the previous section, the framework that we will present below will again center around finding $f_{obj+}$. We will first create an upper bound on $f_{obj+}$ (this will essentially amount to creating a procedure that is analogous to the one presented in Section 2.1). We will then afterwards create a mechanism analogous to the one from Section 2.2 that can be used to establish a lower bound on $f_{obj+}$. Of course, as was the case in Section 2, all these bounds, as well as the entire analysis, will be probabilistic.

3.1 Upper-bounding $f_{obj+}$

In this section we present a general framework for finding a "high-probability" upper bound on $f_{obj+}$. As usual, we start by noting that if one knows that $y = A\tilde{x} + v$ holds then (148) can be rewritten as

$$\min_x \; \|x\|_1 - \|\tilde{x}\|_1$$
$$\text{subject to} \quad \|v + A\tilde{x} - Ax\|_2 \le r_{socp+}$$
$$x_i \ge 0, \; 1 \le i \le n. \qquad (149)$$

The change of variables $x = \tilde{x} + w$ transforms (149) to

$$\min_w \; \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1$$
$$\text{subject to} \quad \|v - Aw\|_2 \le r_{socp+}$$
$$\tilde{x}_i + w_i \ge 0, \; 1 \le i \le n, \qquad (150)$$

or in a more compact form to

$$\min_w \; \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1$$
$$\text{subject to} \quad \left\|A_v\begin{bmatrix} w \\ \sigma \end{bmatrix}\right\|_2 \le r_{socp+}$$
$$\tilde{x}_i + w_i \ge 0, \; 1 \le i \le n, \qquad (151)$$

where as in Section 2 $A_v = [-A \;\; v]$ is an $m \times (n+1)$ random matrix with i.i.d. standard normal components. Now, let $C_{w_{up+}}$ be a positive scalar. Then the optimal value of the objective of the following optimization problem is an upper bound on $f_{obj+}$

$$\min_w \; \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1$$
$$\text{subject to} \quad \left\|A_v\begin{bmatrix} w \\ \sigma \end{bmatrix}\right\|_2 \le r_{socp+}$$
$$\|w\|_2 \le C_{w_{up+}}$$
$$\tilde{x}_i + w_i \ge 0, \; 1 \le i \le n. \qquad (152)$$
One can then proceed by solving the above optimization problem through the Lagrange duality. However, 
instead of doing that we recognize that (152) is the same as the first equation in Section 4.2 in [62]. One can 
then repeat all the steps from Section 4.2 in [62] until the second to last equation before Lemma 14 to obtain 

n 

-/(;g = - min max ((z^ - 2A(2))^ - ^(i)A)a - z.«va + ||z.(i)||2r,„,,+ + 2 V A^i, 

subject to Af ^ < 0, 1 < i < n, (153) 

where we recall that z^^' is an n dimensional vector of all ones, A*^^-* and u^^' are n and m dimensional 
vectors, respectively, of Lagrange variables, and obviously — /^S is the optimal value of (152). If we can 

establish a "high-probability" lower bound on /q^|'_| we will have a "high-probability" upper bound on the 
objective value of (152). To do so, we will proceed as in Section 2, though in a slightly faster manner. Set 
A(2+) = {A(2)|Af ^ > 0, 1 < i < n} and 

e„p+(a,g,h,i,r,„,p+,Cw„,+ )= ,^ ^ min (C^ ||^h + (z« - A(2))||2 - (ei*^) + ef )^r. 

n 

i=n—k+l 

Then the following lemma, which shows that $\xi_{up+}(\sigma, g, h, \tilde{x}, r_{socp+}, C_{w_{up+}})$, as a Lipschitz function, concentrates around its mean, is a literal analogue to Lemma 3.

Lemma 9. Let $g$ and $h$ be $m$ and $n$ dimensional vectors, respectively, with i.i.d. standard normal variables as their components. Let $\sigma > 0$ be an arbitrary scalar. Let $\xi_{up+}(\sigma, g, h, \tilde{x}, r_{socp+}, C_{w_{up+}})$ be as in (154). Further let $\epsilon_{lip} > 0$ be any constant. Then

P{\E,up+{(y, g, h, X, rsocp+, C^^p^)-E^up+{a; g, h, x, rsocp+, Cw„j,+ )| > e/ip|£^?«p+(cr, g, h, x, rsocp+, Cw„j,+ )|) 

/ ieiipE^up+icr,g,h,i,rsocp+,C^^^^)f\ 



Proof. The proof is literally the same as the corresponding one from Section 2. The only difference is that one now has $\Lambda^{(2+)}$ instead of $\Lambda^{(2)}$. This difference, though, changes nothing in the key arguments used in the proof of Lemma 3. □

Let ^Vp+ and A„„_,_ be the solutions of the optimization in (154). One then has that ||z4^h + z^^^ — 
A„p^||2 and f„p+ concentrate as well. More formally, one then has the following analogues to (155) 

-nnw ^1 , d) \(2) II 7^11^ ^1 , a~) \(2) II 1^ inormup) -r^u .. , (I) ,(2) n n ^ _(normup) 

p{\ia:;+ - Eij:^+\ > ep^+^^i?;:;;;) < e^^^ ™, (i56) 

where as usual g^"°''™"*'-' > g and e^'"''^ > are arbitrarily small constants and g^"°''™"P-' and c^^^^ are 
constants dependent on g!^"°''™"P^ > q and e^"^"^^ > 0, respectively, but independent of n. Repeating the 
arguments between (15) and Lemma 4 one then obtains the following "signed" analogue to Lemma 4. 

Lemma 10. Let w be an n x 1 vector ofi.i.d. zero-mean variance a"^ Gaussian random variables and let 
A be an m X n matrix ofi.i.d. standard normal random variables. Consider an x defined in (6) and a y 
defined in (3) for x = x. Let then fobj+ be as defined in (148) and let w be the solution of (152). There is a 



constant eupper > such that 



where 



P{U,+ < f^f^'^^) > 1 - e— ", (157) 



fobf+ = -Eiup+ (o-, g, h, X, rsocp+ , Cw„p+ ) + eiip \ E^up+ (o", g, h, x, rsocp+ , C'w„p+ ) | + e^^' y/n+ ef' y/n, 

(158) 
^„p4-(cj, g, h, x,rsocp+5 C'w„p+) is as defined in (154), eiip,e\ ,£3 are all positive arbitrarily small con- 
stants, and Cw j_ is a constant such that H'wlk < C^^j .. 

Proof. Follows from the discussion preceding Lemma 4. D 

3.2 Lower-bounding fobj+ 

In this section we present the part of the framework that relates to finding a "high-probability" lower bound on $f_{obj+}$. As in Section 2, to make the arguments that will follow less tedious we will here assume that there is a (if necessary, arbitrarily large) constant $C_w$ such that

$$P(\|w_{socp+}\|_2 < C_w) = 1 - e^{-\epsilon_{C_w} n}, \qquad (159)$$

where of course $w_{socp+}$ is the solution of (146). Now we will look at the following optimization problem



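$$\min_x \; \|y - Ax\|_2$$
$$\text{subject to} \quad \|x\|_1 - \|\tilde{x}\|_1 \le f_{obj+}^{(lower)}$$
$$x_i \ge 0, \; 1 \le i \le n. \qquad (160)$$

If we can show that for certain $f_{obj+}^{(lower)}$ with overwhelming probability the objective of (160) is larger than $r_{socp+}$, then $f_{obj+}^{(lower)}$ will be a "high-probability" lower bound on the optimal value of the objective of (148), i.e. on $f_{obj+}$. Hence, the strategy will be to show that for certain $f_{obj+}^{(lower)}$ the optimal value of the objective in (160) is with overwhelming probability lower bounded by a quantity larger than $r_{socp+}$. We again start by noting that if one knows that $y = A\tilde{x} + v$ holds then (160) can be rewritten as

$$\min_x \; \|v + A\tilde{x} - Ax\|_2$$
$$\text{subject to} \quad \|x\|_1 - \|\tilde{x}\|_1 \le f_{obj+}^{(lower)}$$
$$x_i \ge 0, \; 1 \le i \le n. \qquad (161)$$

Replacing $x = \tilde{x} + w$ back in (161) we have

$$\min_w \; \|v - Aw\|_2$$
$$\text{subject to} \quad \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \le f_{obj+}^{(lower)}$$
$$\tilde{x}_i + w_i \ge 0, \; 1 \le i \le n, \qquad (162)$$

or in a more compact form

$$\min_w \; \left\|A_v\begin{bmatrix} w \\ \sigma \end{bmatrix}\right\|_2$$
$$\text{subject to} \quad \sum_{i=1}^{n} w_i \le f_{obj+}^{(lower)}$$
$$\tilde{x}_i + w_i \ge 0, \; 1 \le i \le n, \qquad (163)$$

where $A_v$ is as in the previous subsection. Set

$$\zeta_{obj+} = \min_w \; \left\|A_v\begin{bmatrix} w \\ \sigma \end{bmatrix}\right\|_2$$
$$\text{subject to} \quad \sum_{i=1}^{n} w_i \le f_{obj+}^{(lower)}$$
$$\tilde{x}_i + w_i \ge 0, \; 1 \le i \le n. \qquad (164)$$

Let

$$S_+(\sigma, \tilde{x}, C_w, f_{obj+}^{(lower)}) = \left\{[w^T \; \sigma]^T \in R^{n+1} \;\Big|\; \|w\|_2 \le C_w \text{ and } \sum_{i=1}^{n} w_i \le f_{obj+}^{(lower)} \text{ and } \tilde{x}_i + w_i \ge 0, \; 1 \le i \le n\right\}. \qquad (165)$$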
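Set

$$\zeta^{(help)}_{obj+} = \min_{[w^T \sigma]^T \in S_+(\sigma, \tilde{x}, C_w, f^{(lower)}_{obj+})} \left\| A_v \begin{bmatrix} w \\ \sigma \end{bmatrix} \right\|_2 = \min_{[w^T \sigma]^T \in S_+(\sigma, \tilde{x}, C_w, f^{(lower)}_{obj+})} \ \max_{\|a\|_2 = 1} \ a^T A_v \begin{bmatrix} w \\ \sigma \end{bmatrix} \tag{166}$$

and

$$\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = \min_{[w^T \sigma]^T \in S_+(\sigma, \tilde{x}, C_w, f^{(lower)}_{obj+})} \ \sqrt{\|w\|_2^2 + \sigma^2}\, \|g\|_2 + \sum_{i=1}^{n} h_i w_i. \tag{167}$$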



As in Section 2, since $C_w$ is not a parameter of substantial interest in our derivations we will again omit it from the list of arguments of $\xi_+$. Before establishing probabilistic arguments related to lower-bounding of (166) we will first in Section 3.2.1 establish a deterministic result related to the optimization of $\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$. We will then in Section 3.2.2 find that $\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$ concentrates and afterwards return to the probabilistic analysis of (166).

3.2.1 Optimizing $\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$

In this section we find $\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$. First let us rewrite the optimization problem from (167) in the following form



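$$\begin{aligned} \xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = \min_{w} \quad & \sqrt{\|w\|_2^2 + \sigma^2}\, \|g\|_2 + \sum_{i=1}^{n} h_i w_i \\ \text{subject to} \quad & \sum_{i=1}^{n} w_i \leq f^{(lower)}_{obj+} \\ & \tilde{x}_i + w_i \geq 0, \ 1 \leq i \leq n \\ & \sqrt{\|w\|_2^2 + \sigma^2} \leq \sqrt{C_w^2 + \sigma^2}. \end{aligned} \tag{168}$$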



From this point one can proceed with solving the above problem through Lagrangian duality. However, in- 
stead one can recognize that the above optimization problem is fairly similar to (169) in [62]. The difference 
is only in the constant term in the first constraint. After carefully repeating all the steps between (169) and 
(178) in [62] one then arrives at the following analogue to (178) from [62] 



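$$\begin{aligned} \xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = \max_{\nu, \lambda^{(2)}, \gamma} \quad & \sigma \sqrt{(\|g\|_2 + \gamma)^2 - \|h + \nu z^{(1)} - \lambda^{(2)}\|_2^2} - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i - \gamma \sqrt{C_w^2 + \sigma^2} - \nu f^{(lower)}_{obj+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n \\ & \|g\|_2 + \gamma - \|h + \nu z^{(1)} - \lambda^{(2)}\|_2 \geq 0 \\ & \gamma \geq 0. \end{aligned} \tag{169}$$

To do the maximization over $\gamma$ we set the derivative to zero

$$\frac{\|g\|_2 + \gamma}{\sqrt{(\|g\|_2 + \gamma)^2 - \|h + \nu z^{(1)} - \lambda^{(2)}\|_2^2}} = \frac{\sqrt{C_w^2 + \sigma^2}}{\sigma} \tag{170}$$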
and after some algebra find

$$\gamma_{opt+} = \sqrt{1 + \frac{\sigma^2}{C_w^2}}\, \|h + \nu z^{(1)} - \lambda^{(2)}\|_2 - \|g\|_2, \tag{171}$$

where of course, as in Section 2, $\gamma_{opt+}$ would be the solution of (169) only if larger than or equal to zero. Alternatively of course $\gamma_{opt+} = 0$. Now, based on these two scenarios we distinguish two different optimization problems:

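1. The "overwhelming" optimization

$$\begin{aligned} \xi_{ov+}(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 - \|h + \nu z^{(1)} - \lambda^{(2)}\|_2^2} - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i - \nu f^{(lower)}_{obj+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n. \end{aligned} \tag{172}$$

2. The "non-overwhelming" optimization

$$\begin{aligned} \xi_{nov+}(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = \max_{\nu, \lambda^{(2)}} \quad & \sqrt{C_w^2 + \sigma^2}\, \|g\|_2 - C_w \|h + \nu z^{(1)} - \lambda^{(2)}\|_2 - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i - \nu f^{(lower)}_{obj+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n. \end{aligned} \tag{173}$$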

The "overwhelming" optimization is equivalent to (169) if for its optimal values $\nu_+$ and $\lambda^{(2+)}$ it holds

$$\sqrt{1 + \frac{\sigma^2}{C_w^2}}\, \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2 \leq \|g\|_2. \tag{174}$$
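
The algebra behind (171) and behind the objective of the "non-overwhelming" problem (173) is short enough to spell out. The following is only a sketch; the shorthand $G$, $B$, $C$ is introduced here for readability and is not notation from the paper, and $G + \gamma \geq 0$ is assumed when taking square roots.

```latex
% Shorthand (ours): G = \|g\|_2, \; B = \|h + \nu z^{(1)} - \lambda^{(2)}\|_2, \; C = C_w.
% Step 1: solve the stationarity condition (170) for \gamma.
\begin{align*}
\frac{G+\gamma}{\sqrt{(G+\gamma)^2 - B^2}} = \frac{\sqrt{C^2+\sigma^2}}{\sigma}
 &\iff \sigma^2 (G+\gamma)^2 = (C^2+\sigma^2)\big((G+\gamma)^2 - B^2\big) \\
 &\iff C^2 (G+\gamma)^2 = (C^2+\sigma^2)\, B^2 \\
 &\iff \gamma = \sqrt{1 + \sigma^2/C^2}\; B - G .
\end{align*}
% Step 2: substitute this \gamma into the \gamma-dependent part of the objective of (169).
\begin{align*}
\sigma\sqrt{(G+\gamma)^2 - B^2} - \gamma\sqrt{C^2+\sigma^2}
 &= \frac{\sigma^2}{C}\, B - \frac{C^2+\sigma^2}{C}\, B + \sqrt{C^2+\sigma^2}\; G \\
 &= \sqrt{C^2+\sigma^2}\; G - C\, B .
\end{align*}
```

Setting $\gamma = 0$ instead leaves $\sigma\sqrt{G^2 - B^2}$, which is the objective of the "overwhelming" problem (172); the sign of the expression in Step 1 is what condition (174) tests.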



We now summarize in the following lemma the results of this subsection. 

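Lemma 11. Let $\nu_+$ and $\lambda^{(2+)}$ be the solutions of (172) and analogously let $\hat{\nu}_+$ and $\hat{\lambda}^{(2+)}$ be the solutions of (173). Let $\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$ be, as defined in (167), the optimal value of the objective function in (168). Then

$$\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = \begin{cases} \sigma \sqrt{\|g\|_2^2 - \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2^2} - \sum_{i=n-k+1}^{n} \lambda_i^{(2+)} \tilde{x}_i - \nu_+ f^{(lower)}_{obj+}, & \text{if } \sqrt{1 + \frac{\sigma^2}{C_w^2}}\, \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2 \leq \|g\|_2 \\ \sqrt{C_w^2 + \sigma^2}\, \|g\|_2 - C_w \|h + \hat{\nu}_+ z^{(1)} - \hat{\lambda}^{(2+)}\|_2 - \sum_{i=n-k+1}^{n} \hat{\lambda}_i^{(2+)} \tilde{x}_i - \hat{\nu}_+ f^{(lower)}_{obj+}, & \text{otherwise.} \end{cases} \tag{175}$$

Moreover, let $w_+$ be the solution of (167). Then

$$w_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = \begin{cases} \dfrac{\sigma (h + \nu_+ z^{(1)} - \lambda^{(2+)})}{\sqrt{\|g\|_2^2 - \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2^2}}, & \text{if } \sqrt{1 + \frac{\sigma^2}{C_w^2}}\, \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2 \leq \|g\|_2 \\ \dfrac{C_w (h + \hat{\nu}_+ z^{(1)} - \hat{\lambda}^{(2+)})}{\|h + \hat{\nu}_+ z^{(1)} - \hat{\lambda}^{(2+)}\|_2}, & \text{otherwise,} \end{cases} \tag{176}$$

and

$$\|w_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})\|_2 = \begin{cases} \dfrac{\sigma \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2}{\sqrt{\|g\|_2^2 - \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2^2}}, & \text{if } \sqrt{1 + \frac{\sigma^2}{C_w^2}}\, \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2 \leq \|g\|_2 \\ C_w, & \text{otherwise.} \end{cases} \tag{177}$$

Proof. The first part follows trivially. The second one follows the same way it does in Lemma 2 in [62]. □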

3.2.2 Concentration of $\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$

In this section we establish that $\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$ concentrates with high probability around its mean.

Lemma 12. Let $g$ and $h$ be $m$ and $n$ dimensional vectors, respectively, with i.i.d. standard normal variables as their components. Let $\sigma > 0$ be an arbitrary scalar. Let $\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$ be as in (167). Further let $\epsilon_{lip} > 0$ be any constant. Then

$P(|\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) - E\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})| \geq \epsilon_{lip} |E\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})|)$

fa,E^^(a,g,h,x,/gr)))^ ^ 



Proof. The proof is the same as the proof of Lemma 4 in [62]. The only difference is the structure of set 5+ 
which does not impact substantially any of the arguments in the proof presented in [62]. D 

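One then has that $\|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2$, $\|h + \hat{\nu}_+ z^{(1)} - \hat{\lambda}^{(2+)}\|_2$, $\nu_+$, and $\hat{\nu}_+$ concentrate as well, which automatically implies that $w_+$ also concentrates. More formally, one then has the following analogues to (178)

$$\begin{aligned} P(|\|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2 - E\|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2| \geq \epsilon_1^{(norm)} E\|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2) &\leq e^{-\epsilon_2^{(norm)} n} \\ P(|\|h + \hat{\nu}_+ z^{(1)} - \hat{\lambda}^{(2+)}\|_2 - E\|h + \hat{\nu}_+ z^{(1)} - \hat{\lambda}^{(2+)}\|_2| \geq \epsilon_3^{(norm)} E\|h + \hat{\nu}_+ z^{(1)} - \hat{\lambda}^{(2+)}\|_2) &\leq e^{-\epsilon_4^{(norm)} n} \\ P(|\nu_+ - E\nu_+| \geq \epsilon_1^{(\nu)} E\nu_+) &\leq e^{-\epsilon_2^{(\nu)} n} \\ P(|\hat{\nu}_+ - E\hat{\nu}_+| \geq \epsilon_3^{(\nu)} E\hat{\nu}_+) &\leq e^{-\epsilon_4^{(\nu)} n} \\ P(|\|w_+\|_2 - E\|w_+\|_2| \geq \epsilon_1^{(w)} E\|w_+\|_2) &\leq e^{-\epsilon_2^{(w)} n}, \end{aligned} \tag{179}$$

where as usual $\epsilon_1^{(norm)} > 0$, $\epsilon_3^{(norm)} > 0$, $\epsilon_1^{(\nu)} > 0$, $\epsilon_3^{(\nu)} > 0$, and $\epsilon_1^{(w)} > 0$ are arbitrarily small constants and $\epsilon_2^{(norm)}$, $\epsilon_4^{(norm)}$, $\epsilon_2^{(\nu)}$, $\epsilon_4^{(\nu)}$, and $\epsilon_2^{(w)}$ are constants dependent on $\epsilon_1^{(norm)} > 0$, $\epsilon_3^{(norm)} > 0$, $\epsilon_1^{(\nu)} > 0$, $\epsilon_3^{(\nu)} > 0$, and $\epsilon_1^{(w)} > 0$, respectively, but independent of $n$.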

Now, we return to the probabilistic analysis of (166). Following the arguments between (41) and (44) as well as those between (58) and (61) (and additionally combining all of them with those between (58) and (64) in [62]) one obtains the "signed" analogue to (61)

$$P\Big( \min_{[w^T \sigma]^T \in S_+(\sigma, \tilde{x}, C_w, f^{(lower)}_{obj+})} \Big\| A_v \begin{bmatrix} w \\ \sigma \end{bmatrix} \Big\|_2 \geq d^{(lower)}_{obj+} \Big)(1 - e^{-\epsilon_{C_w} n}) \geq (1 - e^{-\epsilon_{lower} n})(1 - e^{-\epsilon_{C_w} n}), \tag{180}$$

where

$$d^{(lower)}_{obj+} = (1 - \epsilon_{lip}) E\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) - \epsilon_1^{(g)} \sqrt{n} - \epsilon_3^{(g)} \sqrt{n}, \tag{181}$$

$\epsilon_{lower}$ is a constant independent of $n$, and $\epsilon_1^{(g)}$, $\epsilon_3^{(g)}$ are arbitrarily small constants. Finally we are in position to summarize the above results in the following lemma.

Lemma 13. Let $v$ be an $n \times 1$ vector of i.i.d. zero-mean variance $\sigma^2$ Gaussian random variables and let $A$ be an $m \times n$ matrix of i.i.d. standard normal random variables. Consider an $\tilde{x}$ defined in (6) and a $y$ defined in (3) for $x = \tilde{x}$. Let then $\zeta_{obj+}$ be as defined in (164) and let $w$ be the solution of (164). Assume $P(\|w\|_2 \leq C_w) \geq 1 - e^{-\epsilon_{C_w} n}$ for an arbitrarily large constant $C_w$ and a constant $\epsilon_{C_w} > 0$ dependent on $C_w$ but independent of $n$. Then there is a constant $\epsilon_{lower} > 0$ such that

$$P(\zeta_{obj+} \geq d^{(lower)}_{obj+}) \geq (1 - e^{-\epsilon_{lower} n})(1 - e^{-\epsilon_{C_w} n}), \tag{182}$$

where

$$d^{(lower)}_{obj+} = (1 - \epsilon_{lip}) E\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) - \epsilon_1^{(g)} \sqrt{n} - \epsilon_3^{(g)} \sqrt{n}, \tag{183}$$

$\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+})$ is as defined in (167) (and can be computed through (172) and (173)), and $\epsilon_{lip}$, $\epsilon_1^{(g)}$, $\epsilon_3^{(g)}$ are all arbitrarily small positive constants.

Proof. Follows from the discussion above and the one presented in Section 2.2.2. □

The above lemma achieves one of the goals established right after (153). Namely, for a given $f^{(lower)}_{obj+}$ it establishes a high probability lower bound $d^{(lower)}_{obj+}$ on $\zeta_{obj+}$. As we stated earlier, if one can find $f^{(lower)}_{obj+}$ such that $d^{(lower)}_{obj+} > r_{socp+}$ then $f^{(lower)}_{obj+}$ would be a high probability lower bound on $f_{obj+}$. Moreover, one may hope that $f^{(lower)}_{obj+} \approx f^{(upper)}_{obj+}$ and that the $C_{w_{up+}}$ for which this would happen is such that $C_{w_{up+}} \approx \|w_{socp+}\|_2$. We establish all of this in the following section.

3.3 Matching upper and lower bounds 

In this section we specialize the general bounds $f^{(upper)}_{obj+}$ and $f^{(lower)}_{obj+}$ introduced above and show how they can match each other. As in Section 2.3, we will divide the presentation into several subsections. In the first of the subsections we will make a connection to the noiseless case and show how one can then remove the constraint from (175), (176), and (177). In the second and third subsection we will specialize the upper and lower bounds on $f_{obj+}$ computed in Sections 3.1 and 3.2 and show that they can match each other. In the fourth subsection we will quantify how much the lower bound on $\zeta_{obj+}$ that can be computed through the framework presented in Section 3.2 for a "suboptimal" $w$ deviates from the optimal one obtained for $w_+$. In the last subsection we will connect all the pieces and draw conclusions regarding the consequences that their combination has for several SOCP parameters.
3.3.1 Connection to the "signed" $\ell_1$ optimization

Before proceeding further with the core arguments, in this subsection we establish a technically helpful connection between the constraint in (175), (176), and (177) and the "signed" fundamental performance characterization of $\ell_1$ optimization derived in [64] (and of course earlier in the context of neighborly polytopes/simplices in [28]). What we present here is exactly the same as what was presented in the corresponding section in [62] and of course structurally analogous to what was presented in Section 2.3.1. However, since the analysis that we will present below will be reusing it repeatedly we include it here again. We first recall the condition from Lemma 11. The condition states



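$$\sqrt{1 + \frac{\sigma^2}{C_w^2}}\, \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2 \leq \|g\|_2, \tag{184}$$

where $C_w$ is an arbitrarily large constant and $\nu_+$ and $\lambda^{(2+)}$ are the solution of

$$\begin{aligned} \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 - \|h + \nu z^{(1)} - \lambda^{(2)}\|_2^2} - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i \\ \text{subject to} \quad & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n \\ & \nu \geq 0. \end{aligned} \tag{185}$$

Now we note the following equivalent to (185) for the case when nonzero components of $\tilde{x}$ are infinite

$$\begin{aligned} \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 - \|h + \nu z^{(1)} - \lambda^{(2)}\|_2^2} \\ \text{subject to} \quad & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n-k \\ & \lambda_i^{(2)} = 0, \ n-k+1 \leq i \leq n \\ & \nu \geq 0. \end{aligned} \tag{186}$$

To make the new observations easily comparable to the corresponding ones from [63,65] we set

$$h^+ = [h_{(1)}, h_{(2)}, \dots, h_{(n-k)}, h_{n-k+1}, h_{n-k+2}, \dots, h_n]^T, \tag{187}$$

where $[h_{(1)}, h_{(2)}, \dots, h_{(n-k)}]$ are the elements of $[h_1, h_2, \dots, h_{n-k}]$ sorted in increasing order (possible ties in the sorting process are of course broken arbitrarily). Also we let $z^{(2)}$ be such that $z_i^{(2)} = -z_i^{(1)}$, $n-k+1 \leq i \leq n$, and $z_i^{(2)} = z_i^{(1)}$, $1 \leq i \leq n-k$. It is then relatively easy to see that the above optimization problem is equivalent to

$$\begin{aligned} \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 - \|h^+ - \nu z^{(2)} + \lambda^{(2)}\|_2^2} \\ \text{subject to} \quad & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n-k \\ & \lambda_i^{(2)} = 0, \ n-k+1 \leq i \leq n \\ & \nu \geq 0. \end{aligned} \tag{188}$$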
Let $\nu_{\ell_1+}$ and $\lambda^{(\ell_1+)}$ be the solution of the above maximization. Further, consider the following "signed" version of the $\ell_1$ optimization from (2)

$$\begin{aligned} \min_{x} \quad & \|x\|_1 \\ \text{subject to} \quad & Ax = y \\ & x_i \geq 0, \ 1 \leq i \leq n. \end{aligned} \tag{189}$$

Then, as we showed in [65] and [64], the inequality

$$E\|g\|_2 > E\|h^+ - \nu_{\ell_1+} z^{(2)} + \lambda^{(\ell_1+)}\|_2 \tag{190}$$

establishes the following "signed" fundamental performance characterization of the $\ell_1$ optimization algorithm from (189) that could be used instead of SOCP to recover "signed" $\tilde{x}$ in (1) (which is a noiseless version of (3))

$$(1 - \beta_w^+) \frac{\sqrt{\frac{1}{2\pi}}\, e^{-\left(\mathrm{erfinv}\left(2\frac{1 - \alpha_w^+}{1 - \beta_w^+} - 1\right)\right)^2}}{\alpha_w^+} - \sqrt{2}\, \mathrm{erfinv}\Big(2\frac{1 - \alpha_w^+}{1 - \beta_w^+} - 1\Big) = 0, \tag{191}$$

where of course $\alpha_w^+ = \frac{m}{n}$ and $\beta_w^+ = \frac{k}{n}$. As it is also shown in [65] and [64] both of the quantities under the expected values in (190) nicely concentrate. Then with overwhelming probability one has that for any pair $(\alpha, \beta_w^+)$ that satisfies (or lies below) the above fundamental performance characterization of $\ell_1$ optimization

$$\|g\|_2 > \|h^+ - \nu_{\ell_1+} z^{(2)} + \lambda^{(\ell_1+)}\|_2. \tag{192}$$

Moreover, since $\lambda_i^{(2+)} \geq 0$, $n-k+1 \leq i \leq n$ (and of course by the signed assumption $\tilde{x}_i \geq 0$, $1 \leq i \leq n$) in (185), one actually has that (192) implies

$$\|g\|_2 > \|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2, \tag{193}$$

which for sufficiently large $C_w$ is the same as (184). We then in what follows assume that the pair $(\alpha, \beta_w^+)$ is such that it satisfies the fundamental $\ell_1$ optimization performance characterization from (191) (or is in the region below it) and therefore proceed by ignoring the condition (184).
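
To make the characterization (191) concrete, the following minimal sketch solves it numerically for $\beta_w^+$ given $\alpha$. It relies on the identity $\sqrt{2}\,\mathrm{erfinv}(2t - 1) = \Phi^{-1}(t)$, where $\Phi$ is the standard normal distribution function, so only the Python standard library is needed. The function names and the bisection scheme are illustrative choices made here and are not part of the paper.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for Phi^{-1} and the density


def signed_l1_residual(alpha, beta):
    """Left-hand side of (191) for alpha = m/n, beta = k/n, 0 < beta < alpha < 1."""
    # q = sqrt(2) * erfinv(2 * (1 - alpha) / (1 - beta) - 1)
    q = _STD.inv_cdf((1.0 - alpha) / (1.0 - beta))
    # pdf(q) = sqrt(1 / (2 pi)) * exp(-erfinv(...)^2)
    return (1.0 - beta) * _STD.pdf(q) / alpha - q


def signed_l1_threshold(alpha, iters=100):
    """Bisection for the beta in (0, alpha) at which the residual vanishes."""
    lo, hi = 0.0, alpha  # residual is positive near 0 and negative near alpha
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if signed_l1_residual(alpha, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `signed_l1_threshold(0.5)` returns a value of roughly 0.28; pairs $(\alpha, \beta_w^+)$ with $\beta_w^+$ below the returned value lie in the region "below" the characterization that the rest of the section assumes.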

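3.3.2 Optimizing $f_{obj+}$'s upper bound

In this section we will lower the value of the upper bound created in Section 3.1 as much as we can by a particular choice of $C_{w_{up+}}$. Let $\xi_{dual+}(\sigma, g, h, \tilde{x}, r_{socp+})$ be

$$\begin{aligned} \xi_{dual+}(\sigma, g, h, \tilde{x}, r_{socp+}) = \min_{d \geq 0} \max_{\nu, \lambda^{(2)}} \quad & \sqrt{d^2 + \sigma^2}\, \|g\|_2 \nu - d \|\nu h + z^{(1)} - \lambda^{(2)}\|_2 - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i - \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n. \end{aligned} \tag{194}$$

Rewriting (194) with a simple sign flipping we obtain

$$\begin{aligned} -\xi_{dual+}(\sigma, g, h, \tilde{x}, r_{socp+}) = \max_{d \geq 0} \min_{\nu, \lambda^{(2)}} \quad & -\sqrt{d^2 + \sigma^2}\, \|g\|_2 \nu + d \|\nu h + z^{(1)} - \lambda^{(2)}\|_2 + \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i + \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n. \end{aligned} \tag{195}$$

The following lemma provides a powerful tool to deal with (195) and is a "signed" analogue to Lemma 8.

Lemma 14. Let $\xi_{dual+}(\sigma, g, h, \tilde{x}, r_{socp+})$ be as defined in (195). Further, let

$$\begin{aligned} -\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}) = \min_{\nu, \lambda^{(2)}} \max_{d \geq 0} \quad & -\sqrt{d^2 + \sigma^2}\, \|g\|_2 \nu + d \|\nu h + z^{(1)} - \lambda^{(2)}\|_2 + \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i + \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n. \end{aligned} \tag{196}$$

Then

$$\xi_{dual+}(\sigma, g, h, \tilde{x}, r_{socp+}) = \xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}). \tag{197}$$

Proof. The proof is literally the same as the proof of Lemma 8. The only difference between optimization problems (195) and (196) and the corresponding ones (76) and (77) from Section 2.3.2 is the set of constraints on $\lambda^{(2)}$. This difference does not substantially affect the structure of the proof of Lemma 8. □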

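Let $\hat{d}_+$, $\hat{\nu}_{up+}$, $\hat{\lambda}^{(2)}_{up+}$ be the solution of (194) (or alternatively let $\hat{\nu}_{up+}$, $\hat{\lambda}^{(2)}_{up+}$ be the solution of (196)). Clearly,

$$\hat{d}_+ = \sigma \frac{\|\hat{\nu}_{up+} h + z^{(1)} - \hat{\lambda}^{(2)}_{up+}\|_2}{\sqrt{\|g\|_2^2 \hat{\nu}_{up+}^2 - \|\hat{\nu}_{up+} h + z^{(1)} - \hat{\lambda}^{(2)}_{up+}\|_2^2}}. \tag{198}$$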
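
The expression for the optimal $d$ can be checked by a one-variable minimization. The sketch below uses the shorthand $a$ and $B$, which is introduced here only for brevity and is not the paper's notation, and assumes $a > B \geq 0$.

```latex
% Shorthand (ours): a = \|g\|_2 \nu, \quad B = \|\nu h + z^{(1)} - \lambda^{(2)}\|_2, \quad a > B \geq 0.
% Inner minimization over d \geq 0 of the d-dependent part of the objective in (194):
\begin{align*}
\phi(d) &= \sqrt{d^2+\sigma^2}\; a - d\, B, \\
\phi'(d) = 0 &\iff \frac{a\, d}{\sqrt{d^2+\sigma^2}} = B
  \iff d^2 (a^2 - B^2) = \sigma^2 B^2
  \iff d = \frac{\sigma B}{\sqrt{a^2 - B^2}}, \\
\phi\Big(\tfrac{\sigma B}{\sqrt{a^2-B^2}}\Big)
  &= \frac{\sigma a^2}{\sqrt{a^2-B^2}} - \frac{\sigma B^2}{\sqrt{a^2-B^2}}
   = \sigma \sqrt{a^2 - B^2}.
\end{align*}
```

With the optimal $d$ substituted, the objective therefore reduces to $\sigma\sqrt{\|g\|_2^2\nu^2 - \|\nu h + z^{(1)} - \lambda^{(2)}\|_2^2}$ minus the remaining linear terms, which is the form that reappears in (209).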

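As shown in Section 3.1 all quantities of interest concentrate and one has

$$E\hat{d}_+ \doteq \sigma \frac{E\|\hat{\nu}_{up+} h + z^{(1)} - \hat{\lambda}^{(2)}_{up+}\|_2}{\sqrt{E\|g\|_2^2\, E\hat{\nu}_{up+}^2 - E\|\hat{\nu}_{up+} h + z^{(1)} - \hat{\lambda}^{(2)}_{up+}\|_2^2}}, \tag{199}$$

where as earlier $\doteq$ indicates that the equality is not exact but can be made as close to exact as needed through the concentrations. Now, set $C_{w_{up+}} = E\hat{d}_+$ in (154). Then a combination of (154), (194), and Lemma 14 gives

$$\begin{aligned} E\xi_{up+}(\sigma, g, h, \tilde{x}, r_{socp+}, E\hat{d}_+) &= E \max_{\lambda^{(2)} \in \Lambda^{(2+)}, \nu \geq 0} \Big( \sqrt{(E\hat{d}_+)^2 + \sigma^2}\, \|g\|_2 \nu - E\hat{d}_+ \|\nu h + z^{(1)} - \lambda^{(2)}\|_2 - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i - \nu r_{socp+} \Big) \\ &= E \min_{d \geq 0} \max_{\lambda^{(2)} \in \Lambda^{(2+)}, \nu \geq 0} \Big( \sqrt{d^2 + \sigma^2}\, \|g\|_2 \nu - d \|\nu h + z^{(1)} - \lambda^{(2)}\|_2 - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i - \nu r_{socp+} \Big) = E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}). \end{aligned} \tag{200}$$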
Moreover, in a fashion similar to the one from Section 2.3.2 one has



i=n—k+l 

_ (201) 

where $(\hat{\lambda}^{(2)}_{up+})_i$ is the $i$-th component of $\hat{\lambda}^{(2)}_{up+}$.

Let $w_{up+}$ be the solution of (152). Then $E\|w_{up+}\|_2 = C_{w_{up+}} = E\hat{d}_+$ and with overwhelming probability $f_{obj+} < f^{(upper)}_{obj+} < E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}) + \epsilon_{lip} |E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})|$ for an arbitrarily small positive constant $\epsilon_{lip}$ ($E\hat{d}_+$ is of course as defined in (199)). In the following section we will show that with overwhelming probability $f_{obj+} > f^{(lower)}_{obj+} > E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}) - \epsilon_{lip} |E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})|$, which will establish $E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})$ as the concentrating point of $f_{obj+}$. Moreover, we will show that if $w_{socp+}$ is such that $E\|w_{socp+}\|_2$ substantially deviates from $E\|w_{up+}\|_2$ then $f_{obj+}$ would substantially deviate from $E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})$, which will establish $E\|w_{up+}\|_2 = C_{w_{up+}} = E\hat{d}_+$ as the concentrating point of $\|w_{socp+}\|_2$.

3.3.3 Specializing $f_{obj+}$'s lower bound

In this section we finally determine the concentrating point of $f_{obj+}$. The results are completely analogous to those from Section 2.3.3. We will just quickly restate them without going through the details again. Let



(202) 



where $\epsilon_{r_{socp+}} > 0$ is an arbitrarily small but fixed constant. From (172) one then has

$$\begin{aligned} \xi_{ov+}(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 - \|h + \nu z^{(1)} - \lambda^{(2)}\|_2^2} - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i - \nu f^{(lower)}_{obj+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n. \end{aligned} \tag{203}$$

Let us choose $\nu = \frac{1}{\hat{\nu}_{up+}}$ and $\lambda^{(2)} = \frac{\hat{\lambda}^{(2)}_{up+}}{\hat{\nu}_{up+}}$ in the objective function of the above optimization. Since this choice is suboptimal and since all the quantities concentrate, (202) would imply

$$E\xi_{ov+}(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) > (1 + \epsilon_{r_{socp+}}) r_{socp+}. \tag{204}$$

On the other hand, based on a combination of the arguments from Section 3.3.1 and (204) one would also have

$$E\xi_+(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) = E\xi_{ov+}(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) > (1 + \epsilon_{r_{socp+}}) r_{socp+}. \tag{205}$$

Finally a combination of (205) and Lemma 13 would give

$$P(\zeta_{obj+} \geq (1 + \epsilon_{r_{socp+}}) r_{socp+}) \geq 1 - e^{-\epsilon_{lower} n}. \tag{206}$$

However, this would, in a statistical sense, contradict the setup of (147). Therefore our assumption that $f^{(lower)}_{obj+}$ satisfies (202) is with overwhelming probability unsustainable. A combination of (206), (200), (201), results from Lemma 10, and the discussion right after Lemma 13 imply that $f_{obj+}$ concentrates around $E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})$.



3.3.4 $\|w_{socp+}\|_2$'s deviation from $\|w_{up+}\|_2$

In this subsection we will show that $\|w_{socp+}\|_2$ cannot deviate substantially from $\|w_{up+}\|_2$ without substantially affecting the value of the lower bound on the objective in (147) that is derived in Section 3.2 (or ultimately the one from Section 3.3.4). Let us assume that there is a $w_{off+}$ such that $x_{socp+} = \tilde{x} + w_{off+}$, where obviously $x_{socp+}$ is the solution of (147) or (146). Further, let $|\|w_{off+}\|_2 - \|w_{up+}\|_2| > \epsilon_{w_{up+}} \|w_{up+}\|_2$, where $\epsilon_{w_{up+}}$ is an arbitrarily small constant.

One can then proceed by repeating the same line of thought as in Section 3.2. The only difference will be that now $C_w = \|w_{off+}\|_2$ and consequently in the definition of $S_+(\sigma, \tilde{x}, C_w, f^{(lower)}_{obj+})$, $\|w\|_2 \leq C_w$ changes to $\|w\|_2 = C_w = \|w_{off+}\|_2$. This difference will of course not affect the concept presented in Section 3.2. The only real consequence will be the change of (168). Adapted to the new scenario (168) becomes

$$\begin{aligned} \xi_{off+}(\sigma, g, h, \tilde{x}, \|w_{off+}\|_2) = \min_{w} \quad & \sqrt{\|w_{off+}\|_2^2 + \sigma^2}\, \|g\|_2 + \sum_{i=1}^{n} h_i w_i \\ \text{subject to} \quad & \|\tilde{x} + w\|_1 - \|\tilde{x}\|_1 \leq E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}) \\ & \sqrt{\|w\|_2^2 + \sigma^2} \leq \sqrt{\|w_{off+}\|_2^2 + \sigma^2}. \end{aligned} \tag{207}$$

Following step by step the derivation after the definition of $\xi_{off}$ in Section 2.3.4 one obtains the following "signed" analogue to (104)

$$E\xi_{off+}(\sigma, g, h, \tilde{x}, \|w_{off+}\|_2) - E\xi_{ov+}(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) \geq \frac{\epsilon_{w_{up+}}^2}{2(1 + \epsilon_{w_{up+}})} E_{e+}, \tag{208}$$

where $E_{e+} = \sigma \sqrt{(E\|g\|_2)^2 - (E\|h + \nu_+ z^{(1)} - \lambda^{(2+)}\|_2)^2}$. As shown in Section 3.3.3, if $f^{(lower)}_{obj+} = E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})$ then $E\xi_{ov+}(\sigma, g, h, \tilde{x}, f^{(lower)}_{obj+}) \geq r_{socp+}$. Knowing that, (208) basically shows that if $\|w_{socp+}\|_2$ were to deviate from $\|w_{up+}\|_2$ the optimal value of the objective in (163) would concentrate around a point that is non-trivially higher than $r_{socp+}$ (note that $E_{e+} \sim \sqrt{n}$). This again contradicts the setup of (147) and makes our deviating assumption unsustainable with overwhelming probability. Hence $w_{socp+}$ is such that $\|w_{socp+}\|_2$ concentrates around $E\|w_{up+}\|_2$ with overwhelming probability.

3.4 Connecting all pieces 

In this section we connect all of the above. We will summarize the results obtained so far in the following theorem.

Theorem 4 (Nonzero elements of $\tilde{x}$ a priori known to be of certain sign). Let $v$ be an $n \times 1$ vector of i.i.d. zero-mean variance $\sigma^2$ Gaussian random variables and let $A$ be an $m \times n$ matrix of i.i.d. standard normal random variables. Further, let $g$ and $h$ be $m \times 1$ and $n \times 1$ vectors of i.i.d. standard normals, respectively. Consider a $k$-sparse $\tilde{x}$ defined in (6) and a $y$ defined in (3) for $x = \tilde{x}$. Let the solution of (146) be $x_{socp+}$ and let the so-called error vector of the SOCP from (146) be $w_{socp+} = x_{socp+} - \tilde{x}$. Let $r_{socp+}$ in (146) be a positive scalar. Let $n$ be large and let constants $\alpha = \frac{m}{n}$ and $\beta_w^+ = \frac{k}{n}$ be below the "signed" fundamental characterization (191). Furthermore, let $\tilde{x}$, $\alpha$, $\beta_w^+$, $\sigma$, and $r_{socp+}$ be such that (147) is feasible with overwhelming probability and $E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})$ defined below is finite. Consider the following optimization problem:

$$\begin{aligned} \xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}) = \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 \nu^2 - \|\nu h + z^{(1)} - \lambda^{(2)}\|_2^2} - \sum_{i=n-k+1}^{n} \lambda_i^{(2)} \tilde{x}_i - \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n. \end{aligned} \tag{209}$$

Let $\hat{\nu}_{up+}$ and $\hat{\lambda}^{(2)}_{up+}$ be the solution of (209). Set

$$\|w_{up+}\|_2 = \sigma \frac{\|\hat{\nu}_{up+} h + z^{(1)} - \hat{\lambda}^{(2)}_{up+}\|_2}{\sqrt{\|g\|_2^2 \hat{\nu}_{up+}^2 - \|\hat{\nu}_{up+} h + z^{(1)} - \hat{\lambda}^{(2)}_{up+}\|_2^2}}. \tag{210}$$

Then:

$$\begin{aligned} P\big(\|\tilde{x} + w_{socp+}\|_1 - \|\tilde{x}\|_1 \in (& E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}) - \epsilon_1^{(socp)} |E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})|, \\ & E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+}) + \epsilon_1^{(socp)} |E\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})|)\big) = 1 - e^{-\epsilon_2^{(socp)} n} \end{aligned} \tag{211}$$

and

$$P\big((1 - \epsilon_1^{(socp)}) E\|w_{up+}\|_2 \leq \|w_{socp+}\|_2 \leq (1 + \epsilon_1^{(socp)}) E\|w_{up+}\|_2\big) = 1 - e^{-\epsilon_2^{(socp)} n}, \tag{212}$$

where $\epsilon_1^{(socp)} > 0$ is an arbitrarily small constant and $\epsilon_2^{(socp)}$ is a constant dependent on $\epsilon_1^{(socp)}$ and $\sigma$ but independent of $n$.

Proof. Follows from the above discussion and a combination of (172), discussions in Section 3.3.1 and those after (206) and (208), and Lemmas 10 and 13. □

The above theorem is the "signed" analogue of Theorem 1 and as such is as powerful a tool as Theorem 
1 itself. As we have done in Section 2 we will below again focus only on, what we will call, SOCP's 
generic performance scenario. We will defer to forthcoming papers consideration of other scenarios as well 
as computation of their relevant performance characterization parameters. 

3.4.1 Signed SOCP's generic performance 

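In this section we focus on the "generic performance" scenario for the SOCP from (4). We will again consider a simplification of (209) that among other things enables one to find a particular "generic" choice of $r_{socp+}$ for which $E\|w_{up+}\|_2$ from Theorem 4 can be upper-bounded over a large range of $\tilde{x}$'s. As in Section 2.4.1, let us now assume that all nonzero components of $\tilde{x}$ in (3) are infinite. Then, clearly, the optimization problem from (209) becomes

$$\begin{aligned} \xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) = \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 \nu^2 - \|\nu h + z^{(1)} - \lambda^{(2)}\|_2^2} - \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} = 0, \ n-k+1 \leq i \leq n \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n-k. \end{aligned} \tag{213}$$

Let $\nu_{gen+}$ and $\lambda^{(gen+)}$ be the solution of (213) and let $w_{gen+}$ be the error vector in the case when all nonzero components of $\tilde{x}$ are infinite. Clearly, $\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) \leq \xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})$. Then the following generic equivalent to Theorem 4 can be established.

Theorem 5. Assume the setup of Theorem 4. Consider the following optimization problem:

$$\begin{aligned} \xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) = \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 \nu^2 - \|\nu h + z^{(1)} - \lambda^{(2)}\|_2^2} - \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} = 0, \ n-k+1 \leq i \leq n \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n-k. \end{aligned} \tag{214}$$

Let $\nu_{gen+}$ and $\lambda^{(gen+)}$ be the solution of (214). Set

$$\|w_{gen+}\|_2 = \sigma \frac{\|\nu_{gen+} h + z^{(1)} - \lambda^{(gen+)}\|_2}{\sqrt{\|g\|_2^2 \nu_{gen+}^2 - \|\nu_{gen+} h + z^{(1)} - \lambda^{(gen+)}\|_2^2}}. \tag{215}$$

Then:

$$\begin{aligned} P\big(\min_{\tilde{x}} (\xi_{prim+}(\sigma, g, h, \tilde{x}, r_{socp+})) \in (& E\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) - \epsilon_1^{(socp)} |E\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+})|, \\ & E\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) + \epsilon_1^{(socp)} |E\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+})|)\big) = 1 - e^{-\epsilon_2^{(socp)} n} \end{aligned} \tag{216}$$

$$P\big(\exists\, w_{socp+} \ | \ \|w_{socp+}\|_2 \in ((1 - \epsilon_1^{(socp)}) E\|w_{gen+}\|_2, (1 + \epsilon_1^{(socp)}) E\|w_{gen+}\|_2)\big) \geq 1 - e^{-\epsilon_2^{(socp)} n}, \tag{217}$$

where $\epsilon_1^{(socp)} > 0$ is an arbitrarily small constant and $\epsilon_2^{(socp)}$ is a constant dependent on $\epsilon_1^{(socp)}$ and $\sigma$ but independent of $n$.

Proof. Follows from the above discussion and Theorem 4. □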

3.4.2 Optimal rsocp+ for the generic scenario 

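In this section we design a particular choice of $r_{socp+}$ that enables favorable performance of (146) as far as the $\ell_2$ norm of the error vector of (146) is concerned. To that end let us slightly change the objective of (214) in the following way

$$\begin{aligned} \xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) = \max_{\nu, \lambda^{(2)}} \quad & \frac{1}{\nu} \Big( \sigma \sqrt{\|g\|_2^2 - \|h + \nu z^{(1)} - \lambda^{(2)}\|_2^2} - r_{socp+} \Big) \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} = 0, \ n-k+1 \leq i \leq n \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n-k. \end{aligned} \tag{218}$$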
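
The passage from (214) to (218) is a change of variables. A short sketch follows; the primed variables are introduced here only for illustration, and $\nu > 0$ is assumed.

```latex
% Change of variables (ours): \nu' = 1/\nu, \quad \lambda' = \lambda^{(2)}/\nu, \quad \nu > 0.
\begin{align*}
\sigma\sqrt{\|g\|_2^2\,\nu^2 - \|\nu h + z^{(1)} - \lambda^{(2)}\|_2^2} - \nu\, r_{socp+}
 &= \nu\Big(\sigma\sqrt{\|g\|_2^2 - \|h + \tfrac{1}{\nu} z^{(1)} - \tfrac{1}{\nu}\lambda^{(2)}\|_2^2} - r_{socp+}\Big) \\
 &= \frac{1}{\nu'}\Big(\sigma\sqrt{\|g\|_2^2 - \|h + \nu' z^{(1)} - \lambda'\|_2^2} - r_{socp+}\Big).
\end{align*}
% The constraints are preserved: \lambda'_i \geq 0 (or = 0) exactly when \lambda^{(2)}_i \geq 0 (or = 0).
```

Renaming $\nu'$ and $\lambda'$ back to $\nu$ and $\lambda^{(2)}$ gives the objective of (218). In this form the bracket is the quantity that (220) sets to zero at the $\ell_1$ solution.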
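Repeating the arguments between (186) and (188) one has that the following is equivalent to (218)

$$\begin{aligned} \xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) = \max_{\nu, \lambda^{(2)}} \quad & \frac{1}{\nu} \Big( \sigma \sqrt{\|g\|_2^2 - \|h^+ - \nu z^{(2)} + \lambda^{(2)}\|_2^2} - r_{socp+} \Big) \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} = 0, \ n-k+1 \leq i \leq n \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n-k. \end{aligned} \tag{219}$$

Set

$$r^{(opt)}_{socp+} = \sigma \sqrt{(E\|g\|_2)^2 - (E\|h^+ - \nu_{\ell_1+} z^{(2)} + \lambda^{(\ell_1+)}\|_2)^2}, \tag{220}$$

where $\nu_{\ell_1+}$ and $\lambda^{(\ell_1+)}$ are as defined in Section 3.3.1. Using further the arguments from Section 3.3.1 we have

$$r^{(opt)}_{socp+} \doteq \sigma \sqrt{(\alpha - \alpha_w^+) n}, \tag{221}$$

where $\alpha_w^+$ is as defined in the "signed" fundamental characterization (191). Let $w^{(opt)}_{gen+}$ be $w_{gen+}$ in Theorem 5 obtained for $r_{socp+} = r^{(opt)}_{socp+}$. Then repeating the line of arguments between (118) and (121) one has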



-^11'* aen 11^ " / , _,, — r, — rr; , _ , , - ttv: 



{EUh? - E\\h. + ^z(i) - A(£!!!±l||2 



i^gen-\- 



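Since both $\|w^{(opt)}_{gen+}\|_2$ and $\|w_{gen+}\|_2$ concentrate one also has

$$P(\|w^{(opt)}_{gen+}\|_2 \leq \|w_{gen+}\|_2) \geq 1 - e^{-\epsilon_{w_{gen}} n}, \tag{222}$$

where $\epsilon_{w_{gen}} > 0$ is a constant independent of $n$. (222) shows that if $r_{socp+} \neq r^{(opt)}_{socp+}$ then with overwhelming probability there will be a solution to the SOCP from (146), $w_{socp+}$, such that $\|w_{socp+}\|_2 > \|w^{(opt)}_{gen+}\|_2$.

Now let us look at general $\tilde{x}$ and the corresponding optimization problem (209). Now let $r_{socp+} = r^{(opt)}_{socp+}$ in (209). Further, let $\hat{\nu}^{(opt)}_{up+}$ and $\hat{\lambda}^{(opt)}_{up+}$ be the solution of (105) obtained for $r_{socp+} = r^{(opt)}_{socp+}$. Then repeating the line of arguments between (121) and (122) one has

$$E\|w_{up+}\|_2 \doteq \sigma \frac{E\|\hat{\nu}^{(opt)}_{up+} h + z^{(1)} - \hat{\lambda}^{(opt)}_{up+}\|_2}{\sqrt{(E\|g\|_2)^2 (E\hat{\nu}^{(opt)}_{up+})^2 - (E\|\hat{\nu}^{(opt)}_{up+} h + z^{(1)} - \hat{\lambda}^{(opt)}_{up+}\|_2)^2}} \leq \sigma \sqrt{\frac{\alpha_w^+}{\alpha - \alpha_w^+}} \doteq E\|w^{(opt)}_{gen+}\|_2. \tag{223}$$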

Since all random quantities discussed above concentrate we have the following theorem.

Theorem 6. Assume the setup of Theorem 4. Let $r_{socp+}$ in (146) be

$$r_{socp+} = r^{(opt)}_{socp+} = \sigma \sqrt{(\alpha - \alpha_w^+) n}. \tag{224}$$

Then

$$P\Big(\|w_{socp+}\|_2 < \sigma \sqrt{\frac{\alpha_w^+}{\alpha - \alpha_w^+}}\Big) \geq 1 - e^{-\epsilon_1^{(opt)} n}, \tag{225}$$

where $\epsilon_1^{(opt)} > 0$ is a constant independent of $n$ and $\alpha_w^+$ is as defined in the fundamental characterization (191). Moreover, if $r_{socp+}$ in (146) is such that

$$r_{socp+} > r^{(opt)}_{socp+} = \sigma \sqrt{(\alpha - \alpha_w^+) n}, \tag{226}$$

then

$$P\Big(\exists\, w_{socp+} \ \Big| \ \|w_{socp+}\|_2 > \sigma \sqrt{\frac{\alpha_w^+}{\alpha - \alpha_w^+}}\Big) \geq 1 - e^{-\epsilon_2^{(opt)} n}, \tag{227}$$

where $\epsilon_2^{(opt)} > 0$ is a constant independent of $n$.

Proof. Follows from the discussion presented above, Theorem 4, and the discussion presented in Section 2.4.2. □

Remark: Since we assumed the setup of Theorem 4 there will be a potential restriction on pairs $(\alpha, \beta_w^+)$ that goes beyond being below the standard "signed" fundamental characterization (191). We do, however, mention that for $r_{socp+} > r^{(opt)}_{socp+} = \sigma \sqrt{(\alpha - \alpha_w^+) n}$ such a restriction is not necessary in the "generic" scenario, i.e. if $r_{socp+}$ is as in (226), $E\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+})$ will be finite and (147) will be feasible with overwhelming probability. This fact is rather obvious but we mention it for completeness.
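
As a numerical illustration of (224) and (225), the sketch below first finds $\alpha_w^+$ for a given $\beta_w^+$ by solving (191), and then evaluates the radius $\sigma\sqrt{(\alpha - \alpha_w^+) n}$ and the error level $\sigma\sqrt{\alpha_w^+/(\alpha - \alpha_w^+)}$. The function names and the bisection scheme are assumptions of this sketch, not the paper's procedure; the identity $\sqrt{2}\,\mathrm{erfinv}(2t-1) = \Phi^{-1}(t)$ is used to stay within the Python standard library.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal


def alpha_w_plus(beta, iters=100):
    """The alpha in (beta, 1) that solves (191) for a given beta = k/n."""
    lo, hi = beta, 1.0  # residual of (191) is negative near beta, positive near 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        q = _STD.inv_cdf((1.0 - mid) / (1.0 - beta))
        residual = (1.0 - beta) * _STD.pdf(q) / mid - q
        if residual < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def optimal_radius(sigma, alpha, beta, n):
    """sigma * sqrt((alpha - alpha_w^+) * n), the choice of r_socp+ in (224)."""
    aw = alpha_w_plus(beta)
    if alpha <= aw:
        raise ValueError("(alpha, beta) is not below the characterization (191)")
    return sigma * math.sqrt((alpha - aw) * n)


def error_level(sigma, alpha, beta):
    """sigma * sqrt(alpha_w^+ / (alpha - alpha_w^+)), the level appearing in (225)."""
    aw = alpha_w_plus(beta)
    if alpha <= aw:
        raise ValueError("(alpha, beta) is not below the characterization (191)")
    return sigma * math.sqrt(aw / (alpha - aw))
```

For instance, with $\sigma = 1$, $\alpha = 0.7$, $\beta_w^+ = 0.279$ and $n = 2000$, `alpha_w_plus(0.279)` is close to 0.5, so the radius is close to 20 and the error level is close to 1.58.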

3.4.3 Computing $E\|w_{gen+}\|_2$ and $E\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+})$

In this section we present a framework to compute $\|w_{gen+}\|_2$ and $\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+})$ or more precisely their concentrating points $E\|w_{gen+}\|_2$ and $E\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+})$. All other parameters such as $\nu_{gen+}$, $\lambda^{(gen+)}$ can be computed through the framework as well. As in Section 2.4.3 we assume below a familiarity with the techniques introduced in our earlier papers [62, 65]. To shorten the exposition we will then skip many details presented in those papers.

We start by looking at the following optimization problem from (213)

$$\begin{aligned} \xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) = \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 \nu^2 - \|\nu h + z^{(1)} - \lambda^{(2)}\|_2^2} - \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} = 0, \ n-k+1 \leq i \leq n \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n-k. \end{aligned} \tag{228}$$

Using the definitions of $h^+$ and $z^{(2)}$ from Section 3.3.1 we modify the above problem in the following way.

$$\begin{aligned} \xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+}) = \max_{\nu, \lambda^{(2)}} \quad & \sigma \sqrt{\|g\|_2^2 \nu^2 - \|\nu h^+ - z^{(2)} + \lambda^{(2)}\|_2^2} - \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0 \\ & \lambda_i^{(2)} = 0, \ n-k+1 \leq i \leq n \\ & \lambda_i^{(2)} \geq 0, \ 1 \leq i \leq n-k. \end{aligned} \tag{229}$$

Now, let $\lambda^{(gen+)}$ be the solution of the above optimization (as in Section 2.4.3, this is a slight abuse of notation since due to the above restructuring of $h$ this $\lambda^{(gen+)}$ is different from the one in the above Theorem). Following what was presented in [65] there will be a parameter $c_{gen+}$ such that $\lambda^{(gen+)} = [\lambda_1^{(gen+)}, \lambda_2^{(gen+)}, \dots, \lambda_{c_{gen+}}^{(gen+)}, 0, 0, \dots, 0]$ and obviously $c_{gen+} \leq n - k$. At this point let us assume that this parameter is known and fixed. Then following [65] the above optimization becomes



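$$\begin{aligned} \max_{\nu} \quad & \sigma \sqrt{\|g\|_2^2 \nu^2 - \|\nu h^+_{c_{gen+}+1:n} - z^{(2)}_{c_{gen+}+1:n}\|_2^2} - \nu r_{socp+} \\ \text{subject to} \quad & \nu \geq 0. \end{aligned} \tag{230}$$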

Mimicking what was done in Section 2.4.3 we set 

ll„l|2 _ ||i;+ 112 

_ II&II2 Il"cgen++l:™ll2 



^socp+ 



Jgen+ 



(U+ )T (2) 

'^socp+ 



(231) 



and obtain the following equation that can be used to determine $c_{gen+}$ (as in Section 2.4.3, $c_{gen+}$ is the largest natural number such that the left-hand side of the equation below is less than 1 and the term that multiplies $h^+_{c_{gen+}}$ is nonnegative; as in Section 2.4.3, to make writing and exposition easier we write "equal to 1" instead of "less than 1" and correspondingly replace all other inequalities by equalities).

+ ^~(Qgen+^gen+ " (hcgen++l:rt) '^Cgen+ + l-n) 



h+ { 

!"="+ o2 _ II IP I ||u 

"oen+ II&II2 + II" 



V«+ 11^112 T Il"cgen++l:nll2 



u2 +\\J-^^ l|2 

{agen+bgen+ {^Cg^„++l:n} ^Cg,^++l:n) {al^^+~U\\l + \\K^^^^+^.J\l)- 



^2 llo'l|2 4- IIVi"'" ||2 

"'gen+ II&II2 ^ Il"cgen+ + l;nll2 



1. (232) 



Let Cgen+ be the solution of (232). Then 



T„(2) 



{agen+bgen+ \\g^„++l:n) '^Cgen++l.n) 

n'^ llcrl|2 _|_ IIVi"'" ||2 

"'gen+ II&II2 ^ Il"cgen++l:nll2 






''gen+^gen+ V"cj,en++l:n^ ^Cg^^^ + \:n) 1^2 _||„||2, ||h+ , , l|2)-i 



,i2 llcrl|2 4- IIVi"'" ||2 

^gen+ II&II2 ^ Il"c„e„+ + l;nll2 



. (233) 



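From (215) one then has

$$\|w_{gen+}\|_2 = \sigma \frac{\|\nu_{gen+} h^+_{c_{gen+}+1:n} - z^{(2)}_{c_{gen+}+1:n}\|_2}{\sqrt{\|g\|_2^2 \nu_{gen+}^2 - \|\nu_{gen+} h^+_{c_{gen+}+1:n} - z^{(2)}_{c_{gen+}+1:n}\|_2^2}}. \tag{234}$$

Proceeding as in Section 2.4.3 one can then determine the expectations

$$E\|g\|_2^2, \quad E\|h^+_{c_{gen+}+1:n}\|_2^2, \quad E\big((h^+_{c_{gen+}+1:n})^T z^{(2)}_{c_{gen+}+1:n}\big). \tag{235}$$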
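Clearly,

$$E\|g\|_2^2 = m. \tag{236}$$

Let $c_{gen+} = (1 - \theta^+) n$ where $\theta^+$ is a constant independent of $n$. Then as shown in [65]

$$\lim_{n \to \infty} \frac{E\|h^+_{c_{gen+}+1:n}\|_2^2}{n} = \frac{1 - \beta_w^+}{\sqrt{2\pi}} \left( \frac{\sqrt{2}\, \mathrm{erfinv}\big(2\frac{1 - \theta^+}{1 - \beta_w^+} - 1\big)}{e^{\left(\mathrm{erfinv}\left(2\frac{1 - \theta^+}{1 - \beta_w^+} - 1\right)\right)^2}} \right) + \theta^+, \tag{237}$$

where we of course recall that $\beta_w^+ = \frac{k}{n}$. Also, as shown in [65]

$$\lim_{n \to \infty} \frac{E\big((h^+_{c_{gen+}+1:n})^T z^{(2)}_{c_{gen+}+1:n}\big)}{n} = (1 - \beta_w^+) \sqrt{\frac{1}{2\pi}}\, e^{-\left(\mathrm{erfinv}\left(2\frac{1 - \theta^+}{1 - \beta_w^+} - 1\right)\right)^2}. \tag{238}$$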
The only other thing that we will need to compute $c_{gen+}$ (besides the expectations from (235)) is the following inequality related to the behavior of $h^+_{c_{gen+}}$. Again, as shown in [65]

$$P\Big(\sqrt{2}\, \mathrm{erfinv}\Big((1 + \epsilon_1^{(c_{gen+})})\Big(2\frac{1 - \theta^+}{1 - \beta_w^+} - 1\Big)\Big) \leq h^+_{c_{gen+}}\Big) \leq e^{-\epsilon_2^{(c_{gen+})} n}, \tag{239}$$

where $\epsilon_1^{(c_{gen+})} > 0$ is an arbitrarily small constant and $\epsilon_2^{(c_{gen+})}$ is a constant dependent on $\epsilon_1^{(c_{gen+})}$ but independent of $n$.

At this point we have all the necessary ingredients to determine $c_{gen+}$ and consequently $\nu_{gen+}$ and $\|w_{gen+}\|_2$. The following corollary then provides a systematic way of doing so.

Corollary 2. Assume the setup of Theorems 4 and 5. Let $h^+$ be as defined in (187) and let $r^{(sc)}_{socp+} = \lim_{n \to \infty} \frac{r_{socp+}}{\sqrt{n}}$. Let $\alpha = \frac{m}{n}$ and $\beta_w^+ = \frac{k}{n}$ be fixed. Consider the following

+ ( V2{erfinv{2^-1))\ 

A+(n = lim%?±=a ^ - , T^ » y ^^o-^+T) 

n-^oo ^ («c) (sc) 

' socp+ socp+ 



- - n->oo yjn ^(*c) (sc) 

socp+ socp+ 

F+{9+) = ^erfinv{2\^^-l), (240) 

f Pui 



where 



\TJ2) 



n^oo n \ V 27r / 

E\\ht ,, _^, 111 1 «+ /V2(er/?w(2i^-l))\ 

\ e 1"'^™ / 

(241) 
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F+(0^ 



Let $\theta^+$ be the solution of

-{A+{e+)B+{e+) - c+{e+)) - ^{A+{e+)B+{e+) - c+{e+)Y - {B+je+y + e+){A+{0+Y - a + d+{9+)) 

A+{e+Y-a + D+{e+) 

(242) 
Then the concentrating points of $\nu_{gen+}$, $\|w_{gen+}\|_2$, and $\xi^{(gen)}_{prim+}(\sigma, g, h, r_{socp+})$ in Theorem 5 can be determined as



^^gen+ 



(A+{e+)B+{e+) - c+(e+)) - \{A+{e+)B+{e+) - c+{e+)Y - {B+{e+Y + e+){A+{e+Y - q + D+{e+)) 



A+{e+Y-a + D+{e- 



EWV'fap.-nWl — <y 



(yen II 2 



{Evg,n+YD+{e+) - 2Eug,n+C+{e+) + 9+ 
a{Eug,n+Y - {{Eug,,,+)W+{e+) - 2Eug,n+C+{e+) + 



.{gen) 



^CrJ(^'g'h,r, 



j.^ .pr^m. ,^ ^ ^ aJ a{EUgen+r " ((i^Z^3en+)'^(^+) " 2EUg,^+C{e+) + 0+) - i^Z.,en+rroei+- (243) 

I— >-oo yTl 



Proof. Follows from Theorem 5 and the discussion presented above. D 

The results from the above corollary can then be used to compute parameters of interest in our derivation for particular values of $\beta_w^+$, $\alpha$, $\sigma$, and $r_{socp+}$. Similarly to the case of general $\tilde{x}$ we have conducted massive numerical experiments for the case of "signed" $\tilde{x}$ as well. We again observed that the results one obtains through the numerical experiments are in solid agreement with what the presented theory predicts. As we have already mentioned, this paper is an introductory presentation of a framework for the analysis of the SOCP algorithms and we therefore, as in the case of general $\tilde{x}$, refrain from a substantial discussion related to the results obtained from the numerical experiments. Instead, we will in the next subsection present only a small sample of the conducted numerical experiments to demonstrate how precise the presented technique actually is.

3.4.4 Numerical experiments 

Using (142), (143), (144), and (145) one can then, for any $r_{socp+}$, any $\sigma$, and any pair $(\alpha, \beta_w^+)$ (that is below the
fundamental characterization (72)), determine the value of $E\|\mathbf{w}_{socp+}\|_2$ as well as the concentrating points
of all other quantities in our derivations. We will organize the presentation of the numerical results as in
Section 2.4.4. To demonstrate the precision of our technique, in the first couple of experiments that we will
present we ran both the SOCP from (4) as well as (127). In some of the later experiment sets, though, we will
focus only on the SOCP from (4), whose performance is actually the main topic of this paper.

1) Random examples from low $(\alpha, \beta_w^+)$ regime

Analogously to what was done in Section 2.4.4, under the low $(\alpha, \beta_w^+)$ regime we consider pairs $(\alpha, \beta_w^+)$
that are well below the fundamental characterization (191). We ran (228) 500 times for $\alpha \in \{0.3, 0.5, 0.7\}$,
$n = 2000$, $\sigma = 1$, and $r_{socp+} = \sqrt{m} = \sqrt{\alpha n}$ and various randomly chosen values of $\beta_w^+$. In parallel, we ran
(4) 500 times with the same parameters, except that (4) was run for $n = 400$. As mentioned in Section 2.4.4
the non-zero components of x can not really be made infinite. We instead again set them to be ^ when 

generating (3). The results we obtained for $E\nu_{gen+}$, $E\xi_{prim+}^{(gen)}(\sigma, \mathbf{g}, \mathbf{h}, r_{socp+})$, $E\|\mathbf{w}_{gen+}\|_2$, $Ef_{obj+}$, and
$E\|\mathbf{w}_{socp+}\|_2$ through these experiments are presented in Table 6. As in Section 2.4.4, the theoretical values
for any of these quantities in any of the simulated scenarios are given in parallel as bolded numbers. We
observe solid agreement between the theoretical predictions and the results obtained through numerical
experiments.
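The data generation used in such an experiment can be sketched as follows. This is only a minimal illustration under stated assumptions: it assumes that (3) has the form $\mathbf{y} = A\tilde{\mathbf{x}} + \sigma\mathbf{v}$ with i.i.d. standard normal $A$ and $\mathbf{v}$, the function names and the nonzero magnitude `big` are hypothetical, the dimensions are kept small, and the SOCP itself is not solved. The sketch only checks how the residual of the true vector compares with $\sigma\sqrt{m}$.

```python
import math
import random

def noisy_instance(n, alpha, beta, sigma, big, seed=0):
    """Generate (A, x_true, y) with y = A x_true + sigma * v (assumed form of (3))."""
    rng = random.Random(seed)
    m = round(alpha * n)   # number of equations, alpha = m/n
    k = round(beta * n)    # number of nonzero components, beta = k/n
    # A has i.i.d. standard normal components.
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    # "Signed" (nonnegative) k-sparse vector; `big` is a hypothetical stand-in
    # for the large nonzero magnitude used in the experiments.
    x_true = [big] * k + [0.0] * (n - k)
    v = [rng.gauss(0.0, 1.0) for _ in range(m)]
    y = [sum(a * x for a, x in zip(row, x_true)) + sigma * vi
         for row, vi in zip(A, v)]
    return A, x_true, y

def residual_norm(A, x, y):
    """Return ||y - A x||_2 for A given as a list of rows."""
    return math.sqrt(sum((yi - sum(a * xj for a, xj in zip(row, x))) ** 2
                         for row, yi in zip(A, y)))

A, x_true, y = noisy_instance(n=400, alpha=0.5, beta=0.15, sigma=1.0, big=100.0)
m = len(A)
ratio = residual_norm(A, x_true, y) / math.sqrt(m)
print(f"||y - A x_true||_2 / sqrt(m) = {ratio:.3f}")
```

The printed ratio should be close to 1, i.e. the residual of the true vector is roughly $\sigma\sqrt{m}$, which is the scale of the choice $r_{socp+} = \sqrt{m}$ (with $\sigma = 1$) used above.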
Table 6: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp+} = \sqrt{m}$, $\sigma = 1$;
(146) was run 500 times with n = 400; (228) was run 500 times with n = 2000

| $\alpha$ | $\beta_w^+/\alpha$ | $E\nu_{gen+}$ | $E\xi_{prim+}^{(gen)}(\sigma,\mathbf{g},\mathbf{h},r_{socp+})$ | $E\|\mathbf{w}_{gen+}\|_2$ | $Ef_{obj+}$ | $E\|\mathbf{w}_{socp+}\|_2$ |
|-----|------|---------------|---------------|---------------|---------------|---------------|
| 0.3 | 0.15 | 0.6488/0.6484 | 0.1220/0.1228 | 1.1532/1.1561 | 0.1235/0.1228 | 1.1805/1.1561 |
| 0.3 | 0.2  | 0.7067/0.7044 | 0.1721/0.1713 | 1.5070/1.4948 | 0.1763/0.1713 | 1.5358/1.4948 |
| 0.3 | 0.3  | 0.8383/0.8333 | 0.3014/0.2962 | 2.8777/2.6681 | 0.3004/0.2962 | 2.8709/2.6681 |
| 0.5 | 0.3  | 0.8948/0.8942 | 0.3308/0.3312 | 1.8561/1.8471 | 0.3307/0.3312 | 1.8623/1.8471 |
| 0.5 | 0.35 | 0.9714/0.9680 | 0.4124/0.4099 | 2.3237/2.2831 | 0.4117/0.4099 | 2.2945/2.2831 |
| 0.5 | 0.4  | 1.0595/1.0557 | 0.5060/0.5037 | 3.0084/2.9080 | 0.4664/0.5037 | 3.0190/2.9080 |
| 0.7 | 0.45 | 1.1883/1.1844 | 0.6419/0.6392 | 2.6716/2.6333 | 0.6477/0.6392 | 2.6828/2.6333 |
| 0.7 | 0.5  | 1.3008/1.2935 | 0.7691/0.7619 | 3.3183/3.2275 | 0.7649/0.7619 | 3.2377/3.2275 |
| 0.7 | 0.55 | 1.4524/1.4304 | 0.9364/0.9129 | 4.3821/4.0960 | 0.9339/0.9129 | 4.2468/4.0960 |


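Table 7: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp+} = \sqrt{0.2m}$, $\sigma = 1$;
(146) was run 200 times with n = 400; (228) was run 500 times with n = 5000

| $\alpha$ | $\beta_w^+/\alpha$ | $E\nu_{gen+}$ | $E\xi_{prim+}^{(gen)}(1,\mathbf{g},\mathbf{h},\sqrt{0.2m})$ | $E\|\mathbf{w}_{gen+}\|_2$ | $Ef_{obj+}$ | $E\|\mathbf{w}_{socp+}\|_2$ |
|-----|--------|---------------|----------|----------|----------|----------|
| 0.3 | 0.286  | 1.0438/1.0425 | 0.0021/0 | 2.0460/2 | 0.0042/0 | 2.0417/2 |
| 0.5 | 0.3842 | 1.5355/1.5346 | 0.0029/0 | 2.0319/2 | 0.0052/0 | 2.0061/2 |
| 0.7 | 0.4849 | 2.3506/2.3301 | 0.0020/0 | 2.0257/2 | 0.0179/0 | 2.0169/2 |
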

2) Specific examples in low $(\alpha, \beta_w^+)$ regime

a) $r_{socp+} = \sigma\sqrt{(\alpha - \alpha_w^+)n}$

We also ran a carefully designed set of experiments intended to show a specific behavior of the SOCP
from (4) and the above theoretical predictions. For a pair $(\alpha, \beta_w^+)$, instead of choosing $r_{socp+}$ as $\sigma\sqrt{m} = \sigma\sqrt{\alpha n}$,
we chose $r_{socp+} = \sigma\sqrt{(\alpha - \alpha_w^+)n}$, where $\alpha_w^+$ is the one that corresponds to $\beta_w^+$ in the fundamental
characterization (191). As discussed in [62], this choice of $r_{socp+}$ should make the norm-2 of the error vector
in (146) no worse (larger) than the one that can be obtained via a couple of LASSO algorithms considered
in [62]. We then considered the contour LASSO line from [62] that corresponds to the norm-2 of the error
vector equal to 2 and from that line we chose three pairs $(\alpha, \beta_w^+)$ (see Table 7) for which we then ran (146)
(the LASSO contour lines obtained for "signed" x in [62] are shown again in Figure 5; in fact, as mentioned
in Section 2.4.4 and as argued in [62], with $r_{socp+}$ as above the performance of the SOCP from (146) can also
be characterized by these lines, i.e. one may as well refer to them as the "signed" SOCP contour lines). As
usual, to make scaling simpler we set $\sigma = 1$. Based on the results of [62] and those from Section 3.4.2 it is then
easy to see that $r_{socp+} = \sqrt{0.2m}$. We ran (146) 200 times with n = 400. We also ran (228) in parallel for the same
set of parameters. To get somewhat better concentration results we ran (228) 500 times with n = 5000.
The obtained results are presented in Table 7. The theoretical values for any of the simulated quantities in any of
the simulated scenarios are again given in parallel as bolded numbers. We again observe solid agreement
between the theoretical predictions and the results obtained through numerical experiments.
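The value $r_{socp+} = \sqrt{0.2m}$ can be checked with a few lines of arithmetic. The sketch below assumes the relation from [62] between the norm-2 of the error vector and $\alpha_w^+$, namely $\rho = \|\mathbf{w}\|_2/\sigma = \sqrt{\alpha_w^+/(\alpha - \alpha_w^+)}$; the function names are illustrative and not taken from the paper.

```python
import math

def alpha_w_from_rho(alpha, rho):
    """Solve rho = sqrt(alpha_w / (alpha - alpha_w)) for alpha_w."""
    return alpha * rho ** 2 / (1.0 + rho ** 2)

def r_socp_plus(alpha, rho, n, sigma=1.0):
    """r_socp+ = sigma * sqrt((alpha - alpha_w) * n) on the contour line with ratio rho."""
    alpha_w = alpha_w_from_rho(alpha, rho)
    return sigma * math.sqrt((alpha - alpha_w) * n)

n = 400
for alpha in (0.3, 0.5, 0.7):
    m = alpha * n
    # On the contour line with rho = 2 the two printed values should coincide.
    print(alpha, r_socp_plus(alpha, rho=2.0, n=n), math.sqrt(0.2 * m))
```

Solving for $\alpha_w^+$ gives $\alpha - \alpha_w^+ = \alpha/(1+\rho^2)$, so for $\rho = 2$ one gets $r_{socp+} = \sigma\sqrt{\alpha n/5} = \sigma\sqrt{0.2m}$, independently of which pair on the contour line is chosen.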

b) Varying $r_{socp+}$ from $\sqrt{0.2m}$ to $\sqrt{m}$

To observe how the values of the norm of the error vector change with a change in $r_{socp+}$ we conducted
a set of experiments where we chose the same three pairs $(\alpha, \beta_w^+)$ as in the previous set of experiments but
varied $r_{socp+}$. We varied $r_{socp+}$ over the set $\{\sqrt{0.2m}, \sqrt{0.6m}, \sqrt{m}\}$. We focused only on SOCP and ran (4)
200 times with n = 400. The obtained results are presented in Table 8. Again, the theoretical predictions are
given in parallel in bold. The results obtained through numerical experiments are again in solid agreement
with the theoretical predictions. Also, one can see that as $r_{socp+}$ decreases from $\sqrt{m}$ to $\sqrt{0.2m}$, $E\|\mathbf{w}_{socp+}\|_2$
decreases as well.

Table 8: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp+} \in \{\sqrt{0.2m}, \sqrt{0.6m}, \sqrt{m}\}$, $\sigma = 1$; (4) was run 200 times with n = 400

| $\alpha$ | $\beta_w^+/\alpha$ | $Ef_{obj+}$ ($r_{socp+} = \sqrt{0.2m}$) | $E\|\mathbf{w}_{socp+}\|_2$ ($r_{socp+} = \sqrt{0.2m}$) | $Ef_{obj+}$ ($r_{socp+} = \sqrt{0.6m}$) | $E\|\mathbf{w}_{socp+}\|_2$ ($r_{socp+} = \sqrt{0.6m}$) | $Ef_{obj+}$ ($r_{socp+} = \sqrt{m}$) | $E\|\mathbf{w}_{socp+}\|_2$ ($r_{socp+} = \sqrt{m}$) |
|-----|--------|----------|----------|---------------|---------------|---------------|---------------|
| 0.3 | 0.286  | 0.0042/0 | 2.0417/2 | 0.1654/0.1712 | 2.1987/2.1656 | 0.2791/0.2753 | 2.4746/2.4244 |
| 0.5 | 0.3842 | 0.0052/0 | 2.0061/2 | 0.2883/0.3007 | 2.2630/2.2902 | 0.4640/0.4720 | 2.6581/2.6815 |
| 0.7 | 0.4849 | 0.0179/0 | 2.0169/2 | 0.4762/0.4728 | 2.5097/2.4818 | 0.7207/0.7224 | 3.0121/3.0263 |

Table 9: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp+} = \sqrt{0.1m}$, $\sigma = 1$;
(146) was run 200 times with n = 2000; (228) was run 200 times with n = 10000

| $\alpha$ | $\beta_w^+/\alpha$ | $E\nu_{gen+}$ | $E\xi_{prim+}^{(gen)}(1,\mathbf{g},\mathbf{h},\sqrt{0.1m})$ | $E\|\mathbf{w}_{gen+}\|_2$ | $Ef_{obj+}$ | $E\|\mathbf{w}_{socp+}\|_2$ |
|-----|--------|---------------|-----------|----------|-----------|----------|
| 0.3 | 0.3423 | 1.1231/1.220  | 0.0019/0  | 3.1321/3 | -0.0476/0 | 3.1986/3 |
| 0.5 | 0.4672 | 1.7442/1.7369 | -0.0007/0 | 3.0414/3 | 0.0053/0  | 3.1050/3 |
| 0.7 | 0.5971 | 2.9448/2.8817 | -0.0066/0 | 3.0161/3 | 0.0066/0  | 3.0288/3 |
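The trend reported for Table 8 can be checked directly from the tabulated numbers. The short sketch below copies the $E\|\mathbf{w}_{socp+}\|_2$ entries of Table 8 as (experimental, theoretical) pairs; the helper names are illustrative. It reports the largest relative gap between experiment and theory and confirms that both sets of values grow with $r_{socp+}$.

```python
# Table 8 entries for E||w_socp+||_2 as (experimental, theoretical) pairs,
# ordered by r_socp+ = sqrt(0.2m), sqrt(0.6m), sqrt(m); keys are alpha.
table8 = {
    0.3: [(2.0417, 2.0), (2.1987, 2.1656), (2.4746, 2.4244)],
    0.5: [(2.0061, 2.0), (2.2630, 2.2902), (2.6581, 2.6815)],
    0.7: [(2.0169, 2.0), (2.5097, 2.4818), (3.0121, 3.0263)],
}

def max_relative_gap(rows):
    """Largest |experimental - theoretical| / theoretical over all entries."""
    return max(abs(e - t) / t for pairs in rows.values() for e, t in pairs)

def increasing_in_r(rows, column):
    """True if the chosen column (0: experimental, 1: theoretical) grows with r_socp+."""
    return all(p[0][column] < p[1][column] < p[2][column] for p in rows.values())

print(f"max relative gap: {max_relative_gap(table8):.4f}")
print(increasing_in_r(table8, 0), increasing_in_r(table8, 1))
```

For these entries the relative gap stays below 2.5 percent, and both the simulated and the predicted error norms increase as $r_{socp+}$ goes from $\sqrt{0.2m}$ to $\sqrt{m}$.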

3) Specific examples in high $(\alpha, \beta_w^+)$ regime

a) $r_{socp+} = \sigma\sqrt{(\alpha - \alpha_w^+)n}$

We also ran a carefully designed set of experiments intended to show a specific behavior of the SOCP
from (146) and the above theoretical predictions in the "high" $(\alpha, \beta_w^+)$ regime (as in Section 2.4.4, under the "high"
$(\alpha, \beta_w^+)$ regime we of course assume pairs $(\alpha, \beta_w^+)$ that are relatively close to the fundamental
characterization). For a pair $(\alpha, \beta_w^+)$ we again, instead of choosing $r_{socp+}$ as $\sigma\sqrt{m} = \sigma\sqrt{\alpha n}$, chose it based on the LASSO
contour lines. This time, we considered the contour LASSO line from [62] (or Figure 5) that corresponds
to the norm-2 of the error vector equal to 3 and from that line we chose three pairs $(\alpha, \beta_w^+)$ (see Table 9)
for which we then ran (146). We again set $\sigma = 1$. Based on the results of [62] and those from Section 3.4.2
we have $r_{socp+} = \sqrt{0.1m}$. To get better concentration results (the pairs $(\alpha, \beta_w^+)$ are now closer to the
fundamental characterization) we ran (146) 200 times (except the case $\alpha = 0.7$, which was run 100 times)
with n = 2000 and in parallel we ran (228) 200 times with n = 10000 for the same set of other parameters.
The obtained results are presented in Table 9. The theoretical values for any of the simulated quantities in any
of the simulated scenarios are again given in parallel as bolded numbers. As earlier, we observe solid
agreement between the theoretical predictions and the results obtained through numerical experiments.

b) Varying $r_{socp+}$ from $\sqrt{0.1m}$ to $\sqrt{m}$

We also conducted a set of high regime experiments that are analogous to the varying $r_{socp+}$ experiments in the lower
regime. We maintained the structure of the experiments as in the lower regime. The only thing that was
different was the way of choosing the three pairs $(\alpha, \beta_w^+)$. As above, we chose them from the LASSO/SOCP
contour line that corresponds to the norm-2 of the error vector equal to 3. Also, as above, $r_{socp+} = \sigma\sqrt{(\alpha - \alpha_w^+)n} = \sqrt{0.1m}$
(we again for simplicity of scaling assume $\sigma = 1$). We then varied $r_{socp+}$
over the set $\{\sqrt{0.1m}, \sqrt{0.5m}, \sqrt{m}\}$ and again focused only on SOCP and ran (146) 200 times (except the case
$\alpha = 0.7$, which was run 100 times) with n = 2000. The obtained results are presented in Table 10. The
theoretical predictions are given in parallel in bold. The results obtained through numerical experiments are
again in solid agreement with the theoretical predictions. Also, as was the case in the lower regime, one can
see again that as $r_{socp+}$ decreases from $\sqrt{m}$ to $\sqrt{0.1m}$, $E\|\mathbf{w}_{socp+}\|_2$ decreases as well.

Table 10: Experimental/theoretical results for the noisy recovery through SOCP; $r_{socp+} \in \{\sqrt{0.1m}, \sqrt{0.5m}, \sqrt{m}\}$, $\sigma = 1$; (4) was run 200 times with n = 2000

| $\alpha$ | $\beta_w^+/\alpha$ | $Ef_{obj+}$ ($r_{socp+} = \sqrt{0.1m}$) | $E\|\mathbf{w}_{socp+}\|_2$ ($r_{socp+} = \sqrt{0.1m}$) | $Ef_{obj+}$ ($r_{socp+} = \sqrt{0.5m}$) | $E\|\mathbf{w}_{socp+}\|_2$ ($r_{socp+} = \sqrt{0.5m}$) | $Ef_{obj+}$ ($r_{socp+} = \sqrt{m}$) | $E\|\mathbf{w}_{socp+}\|_2$ ($r_{socp+} = \sqrt{m}$) |
|-----|--------|-----------|----------|---------------|---------------|---------------|---------------|
| 0.3 | 0.3423 | -0.0476/0 | 3.1986/3 | 0.2206/0.2221 | 3.3964/3.3082 | 0.3707/0.3725 | 3.9132/3.8409 |
| 0.5 | 0.4672 | 0.0053/0  | 3.1050/3 | 0.4188/0.4111 | 3.7562/3.6109 | 0.5678/0.6723 | 4.8452/4.4771 |
| 0.7 | 0.5971 | 0.0066/0  | 3.0288/3 | 0.5933/0.6893 | 3.9797/4.1157 | 0.9143/1.0968 | 5.0607/5.4164 |



4) Signed SOCP contour lines 

As mentioned earlier (and as shown in [62]), for a particular choice of $r_{socp+}$ the norm-2 of the error
vector of the SOCP from (146), $\|\mathbf{w}_{socp+}\|_2$, can be made as small as the corresponding norm-2 of the error
vector of the LASSO algorithms, $\|\mathbf{w}_{lasso+}\|_2$, considered in [62]. Namely, for $r_{socp+} = \sigma\sqrt{(\alpha - \alpha_w^+)n}$ one
has (in a generic scenario) $E\|\mathbf{w}_{socp+}\|_2 = E\|\mathbf{w}_{lasso+}\|_2 = \sigma\sqrt{\frac{\alpha_w^+}{\alpha - \alpha_w^+}}$. Let $\rho = \sqrt{\frac{\alpha_w^+}{\alpha - \alpha_w^+}}$. Then for different
values of $\rho$ one has the contour lines in the $(\alpha, \beta_w^+)$ plane below which $\|\mathbf{w}_{socp+}\|_2$ is with overwhelming
probability no larger than $\sigma\rho$. Clearly all the contour lines are achieved if the SOCP from (146) is run (for any
$(\alpha, \beta_w^+)$ from the contour line) with $r_{socp+} = r_{socp+}(\rho) = \sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$. In Figure 6 we show what
impact a change of the optimal $r_{socp+}$ has on the contour lines. For concreteness, instead of choosing
$r_{socp+} = r_{socp+}(\rho) = \sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$ we chose $r_{socp+} = \sigma\sqrt{\alpha n}$. As can be seen from the plots, as
$r_{socp+}$ increases from $\sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$ to $\sigma\sqrt{\alpha n}$ the contour lines that guarantee the same $\rho = \frac{E\|\mathbf{w}_{socp+}\|_2}{\sigma}$
ratio go down. However, as was the case in Section 2.4.4 when general x was considered, the difference
is more pronounced in the high $\alpha$ regime (as was the case when general x was considered, since $r_{socp+}$ is
proportional to $\sqrt{\alpha n}$ the difference in $r_{socp+}$ is more pronounced in the high $\alpha$ regime as well).



4 Relating SOCP from (4) to LASSO algorithms 

In this section we briefly recall a connection between the SOCP from (4) and certain LASSO algorithms
that was established in [62] (we will recall the connection only for general x; the connection for "signed"
x is completely analogous). In [62] the following, rather abstract, algorithm was considered for recovering
$\tilde{\mathbf{x}}$ in (3)

$$\min_{\mathbf{x}} \|\mathbf{y} - A\mathbf{x}\|_2 \quad \text{subject to} \quad \|\mathbf{x}\|_1 \leq \|\tilde{\mathbf{x}}\|_1. \qquad (244)$$

If there is a priori available knowledge of $\|\tilde{\mathbf{x}}\|_1$ the above algorithm can be run and, as shown in [62], it
achieves the same generic (worst-case) norm-2 of the error vector as does the SOCP from (4) (of course
assuming that the SOCP is run with $r_{socp}$). We then went further in [62] and considered the following, more
well-known, example from the class of LASSO algorithms

$$\min_{\mathbf{x}} \|\mathbf{y} - A\mathbf{x}\|_2 + \lambda_{lasso}\|\mathbf{x}\|_1. \qquad (245)$$

We argued further that there is a $\lambda_{lasso}$ in (245) such that the generic norm-2's of the error vectors obtained
through (244) and (245) concentrate around the same point, which is also the concentrating point of the generic
$\|\mathbf{w}_{socp}\|_2$.

Figure 5: $(\alpha, \beta_w^+)$ curves as functions of $\rho = \|\mathbf{w}_{socp+}\|_2/\sigma$ for the SOCP algorithm from (146) run with $r_{socp+} = \sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$

As mentioned in [62], the connection presented above relates to a characterization of a particular
performance measure of an SOCP algorithm (the same is of course true for the LASSO algorithms). How adequate
such a performance measure is is a whole other story that goes beyond the scope of the present paper and
we will explore it in more detail elsewhere.
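To make the two LASSO formulations concrete, the following sketch simply evaluates the constraint of (244) and the objective of (245) for a given candidate vector. It is an illustration only: the function names and the tiny example are hypothetical, and no optimization is performed.

```python
import math

def l1_norm(x):
    """Return ||x||_1."""
    return sum(abs(t) for t in x)

def residual_norm(A, x, y):
    """Return ||y - A x||_2 for A given as a list of rows."""
    return math.sqrt(sum((yi - sum(a * xj for a, xj in zip(row, x))) ** 2
                         for row, yi in zip(A, y)))

def feasible_244(x, x_true):
    """Constraint of (244): ||x||_1 <= ||x_true||_1."""
    return l1_norm(x) <= l1_norm(x_true)

def objective_245(A, x, y, lam):
    """Objective of (245): ||y - A x||_2 + lam * ||x||_1."""
    return residual_norm(A, x, y) + lam * l1_norm(x)

# Tiny hypothetical example with a 2 x 2 identity system matrix.
A = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
x_true = [0.0, 2.0]
x = [1.0, 0.0]
print(feasible_244(x, x_true), residual_norm(A, x, y), objective_245(A, x, y, lam=0.5))
```

Note that (244) needs $\|\tilde{\mathbf{x}}\|_1$ as side information, whereas (245) replaces that knowledge with the choice of the parameter $\lambda_{lasso}$.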

5 Discussion 

In this paper we considered "noisy" under-determined systems of linear equations with sparse solutions. 
We looked from a theoretical point of view at polynomial-time second-order cone programming (SOCP) 
algorithms. Under the assumption that the system matrix A has i.i.d. standard normal components, we 
created a general framework that can be used to characterize various quantities of interest in analyzing the 
SOCP's performance. Among other things, the framework enables one to precisely estimate the norm of the 
error vector in "noisy" under-determined systems. Moreover, it can do so for any given k-sparse vector $\tilde{\mathbf{x}}$.

To demonstrate the power of the framework we considered what we referred to as the SOCP's generic
performance. We established the precise values of the "worst-case" norm-2 of the error vector. On the other
hand, using the framework one can create a massive set of results related to the SOCP's non-generic or, as we
will refer to it, problem dependent performance. This though is beyond the scope of an introductory paper
and will be pursued further in one of the forthcoming papers.

Figure 6: Deviation of $(\alpha, \beta_w^+)$ curves as functions of $\rho = \|\mathbf{w}_{socp+}\|_2/\sigma$; solid lines are for the SOCP from (146) run with $r_{socp+} = \sigma\sqrt{\frac{\alpha}{1+\rho^2}n}$;
dashed lines are for the SOCP from (146) run with $r_{socp+} = \sigma\sqrt{\alpha n}$

As for the applications, further developments are pretty much unlimited (this is essentially the same
conclusion one can make for the analysis of the LASSO algorithms presented in [62]). Any problem that
can be solved in the so-called noiseless case (and there is hardly any that cannot) through the mechanisms
developed in [65] and [64] can now be handled in the noisy case as well. For example, quantifying the
performance of SOCP or LASSO optimization problems in solving "noisy" systems with special structure of
the solution vector (block-sparse, binary, box-constrained, low-rank matrix, partially known locations of
nonzero components, just to name a few), or "noisy" systems with noisy (or approximately sparse) solution
vectors, can then easily be handled to an ultimate precision. In a series of forthcoming papers we will present
some of these applications.



References 

[1] R. Adamczak, A. E. Litvak, A. Pajor, and N. Tomczak-Jaegermann. Restricted isometry property of 
matrices with independent columns and neighborly polytopes by random sampling. Preprint, 2009. 
available at arXiv:0904.4723. 

[2] M. Akcakaya and V. Tarokh. A frame construction and a universal distortion bound for sparse
representations. IEEE Trans. on Signal Processing, 56(6), June 2008.

[3] M. S. Asif and J. Romberg. On the lasso and dantzig selector equivalence. 44th Annual Conference on 
Information Sciences and Systems (CISS), pages 1-6, March 2010. 






[4] R. Baraniuk, V. Cevher, M. Duarte, and C. Hegde. Model-based compressive sensing. available online
at http://www.dsp.ece.rice.edu/cs/.

[5] R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin. A simple proof of the restricted isometry 
property for random matrices. Constructive Approximation, 28(3), 2008. 

[6] M. Bayati and A. Montanari. The lasso risk of gaussian matrices. Preprint, available online at 
arXiv:1008.2581. 

[7] R. Berinde, A. C. Gilbert, P. Indyk, H. Karloff, and M. J. Strauss. Combining geometry
and combinatorics: A unified approach to sparse signal recovery. 2008. available online at
http://www.dsp.ece.rice.edu/cs/.

[8] P. J. Bickel, Y. Ritov, and A. B. Tsybakov. Simultaneous analysis of lasso and dantzig selector. The
Annals of Statistics, 37(4):1705-1732, 2009.

[9] F. Bunea, A. B. Tsybakov, and M. H. Wegkamp. Sparsity oracle inequalities for the lasso. Electronic 
Journal of Statistics, 1:169-194, 2007. 

[10] E. Candes. Compressive sampling. Proc. International Congress of Mathematics, pages 1433-1452, 
2006. 

[11] E. Candes. The restricted isometry property and its implications for compressed sensing. Comptes
Rendus de l'Academie des Sciences, Paris, Series I, 346, pages 589-59, 2008.

[12] E. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from
highly incomplete frequency information. IEEE Trans. on Information Theory, 52:489-509, December
2006.

[13] E. Candes, J. Romberg, and T. Tao. Stable signal recovery from incomplete and inaccurate
measurements. Comm. Pure Appl. Math., 59:1207-1223, 2006.

[14] E. Candes and T. Tao. Decoding by linear programming. IEEE Trans. on Information Theory,
51:4203-4215, Dec. 2005.

[15] E. Candes, M. Wakin, and S. Boyd. Enhancing sparsity by reweighted l1 minimization. J. Fourier
Anal. Appl., 14:877-905, 2008.

[16] E. Candes and T. Tao. The dantzig selector: statistical estimation when p is much larger than n. Ann.
Statist., 35(6):2313-2351, 2007.

[17] S.S. Chen and D. Donoho. Examples of basis pursuit. Proceeding of wavelet applications in signal 
and image processing III, 1995. 

[18] S.S. Chen, D. L. Donoho, and M. A. Saunders. Atomic decomposition by basis pursuit. SIAM Journal
on Scientific Computing, 20:33-61, 1998.

[19] S. Chretien. An alternating ell-1 approach to the compressed sensing problem. 2008. available online 
at http://www.dsp.ece.rice.edu/cs/. 

[20] B. S. Cirelson, I. A. Ibragimov, and V. N. Sudakov. Norms of gaussian sample functions. Lect. Notes 
Math., 50, 1976. 






[21] G. Cormode and S. Muthukrishnan. Combinatorial algorithms for compressed sensing. SIROCCO, 
13th Colloquium on Structural Information and Communication Complexity, pages 280-294, 2006. 

[22] S. F. Cotter and B. D. Rao. Sparse channel estimation via matching pursuit with application to
equalization. IEEE Trans. on Communications, 50(3), 2002.

[23] W. Dai and O. Milenkovic. Subspace pursuit for compressive sensing signal reconstruction. Preprint,
available at arXiv:0803.0811, March 2008.

[24] W. Dai and O. Milenkovic. Weighted superimposed codes and constrained integer compressed sensing.
IEEE Trans. on Information Theory, 55(9):2215-2219, September 2009.

[25] M. E. Davies and R. Gribonval. Restricted isometry constants where ell-p sparse recovery can fail for
0 < p ≤ 1. available online at http://www.dsp.ece.rice.edu/cs/.

[26] D. Donoho. High-dimensional centrally symmetric polytopes with neighborliness proportional to
dimension. Disc. Comput. Geometry, 35(4):617-652, 2006.

[27] D. Donoho, A. Maleki, and A. Montanari. The noise-sensitivity phase transition in compressed sensing.
Preprint, Apr. 2010. available on arXiv.

[28] D. Donoho and J. Tanner. Neighborliness of randomly-projected simplices in high dimensions. Proc. 
National Academy of Sciences, 102(27):9452-9457, 2005. 

[29] D. L. Donoho, M. Elad, and V. Temlyakov. Stable recovery of sparse overcomplete representations in 
the presence of noise. IEEE Transactions on Information Theory, 52(1):6-18, Jan 2006. 

[30] D. L. Donoho, Y. Tsaig, I. Drori, and J.L. Starck. Sparse solution of underdetermined linear equations 
by stagewise orthogonal matching pursuit. 2007. available online at http://www.dsp.ece.rice.edu/cs/. 

[31] M. Duarte, M. Davenport, D. Takhar, J. Laska, T. Sun, K. Kelly, and R. Baraniuk. Single-pixel imaging 
via compressive sampling. IEEE Signal Processing Magazine, 25(2), 2008. 

[32] B. Efron, T. Hastie, and R. Tibshirani. Discussion: The dantzig selector: statistical estimation when p
is much larger than n. Ann. Statist., 35(6):2358-2364, 2007.

[33] S. Foucart and M. J. Lai. Sparsest solutions of underdetermined linear systems via ell-q minimization
for 0 < q ≤ 1. available online at http://www.dsp.ece.rice.edu/cs/.

[34] M. P. Friedlander and M. A. Saunders. Discussion: The dantzig selector: statistical estimation when p
is much larger than n. Ann. Statist., 35(6):2385-2391, 2007.

[35] A. Gilbert, M. J. Strauss, J. A. Tropp, and R. Vershynin. Algorithmic linear dimension reduction in
the l1 norm for sparse vectors. 44th Annual Allerton Conference on Communication, Control, and
Computing, 2006.

[36] A. Gilbert, M. J. Strauss, J. A. Tropp, and R. Vershynin. One sketch for all: fast algorithms for
compressed sensing. ACM STOC, pages 237-246, 2007.

[37] Y. Gordon. On Milman's inequality and random subspaces which escape through a mesh in R^n.
Geometric Aspects of Functional Analysis, Isr. Semin. 1986-87, Lect. Notes Math, 1317, 1988.

[38] R. Gribonval and M. Nielsen. Sparse representations in unions of bases. IEEE Trans. Inform. Theory, 
49(12):3320-3325, December 2003. 




[39] R. Gribonval and M. Nielsen. On the strong uniqueness of highly sparse expansions from redundant
dictionaries. In Proc. Int. Conf. Independent Component Analysis (ICA'04), LNCS. Springer-Verlag,
September 2004.

[40] R. Gribonval and M. Nielsen. Highly sparse representations from dictionaries are unique and
independent of the sparseness measure. Appl. Comput. Harm. Anal., 22(3):335-355, May 2007.

[41] J. Haupt and R. Nowak. Signal reconstruction from noisy random projections. IEEE Trans. Information
Theory, pages 4036-4048, September 2006.

[42] P. Indyk and M. Ruzic. Fast and effective sparse recovery using sparse random matrices. 2008.
available on arXiv.

[43] S. Jafarpour, W. Xu, B. Hassibi, and R. Calderbank. Efficient compressed sensing using high-quality
expander graphs. available online at http://www.dsp.ece.rice.edu/cs/.

[44] G. James, P. Radchenko, and J. Lv. Dasso: Connections between the dantzig selector and lasso.
J. Roy. Statist. Soc. Ser. B, 71:127-142, 2009.

[45] V. Koltchinskii. The dantzig selector and sparsity oracle inequalities. Bernoulli, 15(3):799-828, 2009. 

[46] J. Mairal, F. Bach, J. Ponce, Guillermo Sapiro, and A. Zisserman. Discriminative learned dictionaries 
for local image analysis. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2008. 

[47] I. Maravic and M. Vetterli. Sampling and reconstruction of signals with finite rate of innovation in the
presence of noise. IEEE Trans. on Signal Processing, 53(8):2788-2805, August 2005.

[48] N. Meinshausen, G. Rocha, and B. Yu. Discussion: A tale of three cousins: Lasso, l2boosting and
dantzig. Ann. Statist., 35(6):2373-2384, 2007.

[49] N. Meinshausen and B. Yu. Lasso-type recovery of sparse representations for high-dimensional data.
Ann. Statist., 37(1):246-270, 2009.

[50] O. Milenkovic, R. Baraniuk, and T. Simunic-Rosing. Compressed sensing meets bioinformatics: a new
DNA microarray architecture. Information Theory and Applications Workshop, 2007.

[51] D. Needell and J. A. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate
samples. Applied and Computational Harmonic Analysis, 26(3):301-321, 2009.

[52] D. Needell and R. Vershynin. Uniform uncertainty principle and signal recovery via regularized
orthogonal matching pursuit. Foundations of Computational Mathematics, 9(3):317-334, 2009.

[53] F. Parvaresh and B. Hassibi. Explicit measurements with almost optimal thresholds for compressed 
sensing. IEEE ICASSP, Mar-Apr 2008. 

[54] F. Parvaresh, H. Vikalo, S. Misra, and B. Hassibi. Recovering sparse signals using sparse
measurement matrices in compressed DNA microarrays. IEEE Journal of Selected Topics in Signal Processing,
2(3):275-285, June 2008.

[55] G. Pisier. Probabilistic methods in the geometry of banach spaces. Springer Lecture Notes, 1206, 
1986. 

[56] B. Recht, M. Fazel, and P. A. Parrilo. Guaranteed minimum-rank solution of linear matrix equations 
via nuclear norm minimization. 2007. available online at http://www.dsp.ece.rice.edu/cs/. 




[57] F. Rodriguez and G. Sapiro. Sparse representations for image classification: Learning 
discriminative and reconstructive non-parametric dictionaries. 2008. available online at 
http://www.dsp.ece.rice.edu/cs/. 

[58] J. Romberg. Imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2):14-20,
2008.

[59] M. Rudelson and R. Vershynin. Geometric approach to error correcting codes and reconstruction of
signals. International Mathematical Research Notices, 64:4019-4041, 2005.

[60] R. Saab, R. Chartrand, and O. Yilmaz. Stable sparse approximation via nonconvex optimization. 
ICASSP, IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Apr. 2008. 

[61] V. Saligrama and M. Zhao. Thresholded basis pursuit: Quantizing linear programming solutions for 
optimal support recovery and approximation in compressed sensing. 2008. available on arxiv. 

[62] M. Stojnic. A framework for performance characterization of LASSO algorithms. available at arXiv.

[63] M. Stojnic. A rigorous geometry-probability equivalence in characterization of $\ell_1$-optimization.
available at arXiv.

[64] M. Stojnic. Upper-bounding $\ell_1$-optimization weak thresholds. available at arXiv.

[65] M. Stojnic. Various thresholds for $\ell_1$-optimization in compressed sensing. submitted to IEEE Trans.
on Information Theory, 2009. available at arXiv:0907.3666.

[66] R. Tibshirani. Regression shrinkage and selection with the lasso. J. Royal Statistic. Society, B 58:267- 
288, 1996. 

[67] J. Tropp. Just relax: Convex programming methods for identifying sparse signals in noise. IEEE 
Transactions on Information Theory, 52(3):1030-1051, March 2006. 

[68] J. Tropp and A. Gilbert. Signal recovery from random measurements via orthogonal matching pursuit.
IEEE Trans. on Information Theory, 53(12):4655-4666, 2007.

[69] J. A. Tropp. Greed is good: algorithmic results for sparse approximations. IEEE Trans. on Information
Theory, 50(10):2231-2242, 2004.

[70] S. van de Geer. High-dimensional generalized linear models and the lasso. Ann. Statist.,
36(2):614-645, 2008.

[71] H. Vikalo, F. Parvaresh, and B. Hassibi. On sparse recovery of compressed DNA microarrays. Asilomar
conference, November 2007.

[72] M. J. Wainwright. Sharp thresholds for high-dimensional and noisy recovery of sparsity. Proc. Allerton 
Conference on Communication, Control, and Computing, September 2006. 

[73] J. Wright and Y. Ma. Dense error correction via ell-1 minimization. available online at 
http://www.dsp.ece.rice.edu/cs/. 

[74] W. Xu and B. Hassibi. Efficient compressive sensing with deterministic guarantees using expander
graphs. IEEE Information Theory Workshop, September 2007.






